New CSAM Detection Details Emerge Following Craig Federighi Interview

From the article: “Stern extracted substantially more detail from Federighi about what Apple is and isn’t scanning and how CSAM will be recognized and reported.”

Do you have a better word for evaluating every image before it is uploaded, running it through a program to determine whether it matches other images?

They will never have the images that NCMEC has. It is illegal for NCMEC to give them to any other organization. So Apple actually can’t be scanning those images. I can’t think of any other word to use besides scanning, but Adam objects to it. Regardless, if you have iCloud turned on for photos, it will be checking every one of your images.

I should have said hashes of images rather than images. The images Apple will be reviewing on iPhones will be morphed into “NeuralHashes.” I didn’t mention it when I dashed out my previous post because I assumed people contributing to this lengthy discussion would know what I was referring to. And neural hashing is a far, far better thing to do.

I’ve been leaning toward “matching” and trying to get the word “hash” in wherever I can. That’s because what Apple’s doing is creating a hash based on the image that’s being uploaded to iCloud Photos, and then comparing that hash against a database of hashes.

The problem with “scanning” is that it’s used in other contexts to mean something quite different, and is thus causing confusion. It’s not as though Apple is looking at the content of every image being uploaded to iCloud Photos and saying, “The machine learning algorithm says that’s CSAM,” the way it looks at the content of every image to determine whether there’s a penguin or a ship in it.

That’s correct, Apple doesn’t have those images. Apple has worked with NCMEC to create a database of hashes of those images. That’s what’s being compared—the hashes for photos being uploaded to iCloud Photos against the hashes for known illegal CSAM that are in the intersection of the NCMEC database and at least one other similar database under the jurisdiction of another government.
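To make that distinction concrete, here’s a minimal sketch of hash matching in Python, using a simple perceptual hash (dHash) and the Pillow library purely as stand-ins. NeuralHash and the blinded database are far more sophisticated, and every name below is hypothetical, but the point is the same: matching is a lookup in a database of known hashes, not a judgment about what the photo depicts.

```python
from PIL import Image   # assumes Pillow is installed

def dhash(path: str, size: int = 8) -> int:
    """Perceptual hash from brightness gradients; survives resizing and recompression."""
    img = Image.open(path).convert("L").resize((size + 1, size))
    pixels = list(img.getdata())
    bits = 0
    for row in range(size):
        for col in range(size):
            left = pixels[row * (size + 1) + col]
            right = pixels[row * (size + 1) + col + 1]
            bits = (bits << 1) | (left > right)
    return bits

# Hypothetical database of hashes of known images (stand-in for the
# NCMEC-derived NeuralHash database Apple distributes in blinded form).
known_hashes = {0x3A7BD3E2360A3D29}

def matches_database(path: str) -> bool:
    """A pure membership test, not a classification of the image's subject matter."""
    return dhash(path) in known_hashes
```

Because a perceptual hash is derived from the image’s overall structure rather than its exact bytes, the same value survives resizing or recompression, which is why simple transformations of a known image still match while unrelated images of similar subject matter don’t.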

2 Likes

At the risk of speaking for @ace:

  • There’s a difference between scanning for a particular kind of content (e.g. offensive images) and scanning for a specific set of well-known images without any kind of content identification.

  • Apple doesn’t have the NCMEC files. But they have the NeuralHash values for these files, which NCMEC computes and sends to Apple. The algorithms (both device-side and server-side) use this database to determine if an image is or is not one of the NCMEC files.

Apple wrote a technical description (which I attempted to summarize) that explains exactly what is being done.

A superbrief summary is:

  • Apple generates a “blinded hash” database from all of the NeuralHashes provided by NCMEC. This database is stored on iCloud servers and on all iPhones equipped with the software (distributed via iOS updates).
  • Your phone, as a part of the iCloud upload process, computes the NeuralHash of each image, generates a derivative image (a low-res version of the original), and encrypts them using two different algorithms (PSI and TSS). The contents of the blinded hash database are used to generate the key used by the PSI algorithm. The encrypted data is called a security voucher and is uploaded with the image.
  • The nature of PSI is that the security voucher cannot be decrypted unless both the image’s NeuralHash is in the blinded hash database (meaning it’s in the CSAM database) and Apple’s secret key is known. Since your phone doesn’t have the secret key, it cannot know whether the image matches the database (a toy illustration of this asymmetry follows this list).
  • The nature of the TSS algorithm is that, after decrypting the PSI layer, the actual content can’t be viewed unless a threshold number of PSI-matching images have also been uploaded. Craig has said that the threshold is 30 (a sketch of the threshold idea appears at the end of this post).
  • In order to prevent Apple from knowing how many matches have been uploaded before the threshold has been crossed, your phone also uploads synthetic vouchers, which match the PSI algorithm but always fail the TSS algorithm.
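Here’s the toy illustration of why the phone can’t tell whether an image matches. This is not Apple’s PSI construction (the real scheme uses elliptic-curve blinding, as described in the technical summary), and all names are placeholders, but it shows the same asymmetry: the device holds only blinded values and lacks the secret needed to test membership.

```python
import hashlib
import hmac
import os

# --- Server (Apple) side: build the blinded database ---
server_key = os.urandom(32)                          # secret; never leaves the server

ncmec_neuralhashes = [b"hash-of-known-image-1",      # placeholders for real NeuralHash values
                      b"hash-of-known-image-2"]

def blind(neural_hash: bytes) -> bytes:
    """Blind a hash with the server's secret key."""
    return hmac.new(server_key, neural_hash, hashlib.sha256).digest()

blinded_db = {blind(h) for h in ncmec_neuralhashes}  # this is what ships to every phone

# --- Device side: it holds blinded_db but NOT server_key ---
photo_hash = b"hash-of-known-image-1"
# Without server_key the phone cannot compute blind(photo_hash), so it cannot
# evaluate `blind(photo_hash) in blinded_db`; only the server can learn whether
# an uploaded voucher corresponds to a match.
```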

The upshot of all this is:

  • The system can only detect NCMEC’s images (or basic transformations of them, like color-space changes, cropping, rotation, resizing), not other images, even if they are of similar subject matter.
  • Your phone doesn’t know if any images match the database.
  • Apple doesn’t know if any images match the database until 30 matching images have been uploaded. Once there are 30, Apple can view the derivative images of the matches, but not of any other image.

And, as has also been mentioned several times, Apple will have humans review these derivative images, to make sure they really are CSAM and not false-positive matches, before law enforcement is notified.
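Finally, here’s the threshold sketch referenced above: a generic Shamir-style threshold secret sharing scheme, not Apple’s actual TSS construction or parameters, but it shows why nothing can be recovered from 29 matching vouchers while 30 suffice.

```python
import random

PRIME = 2**127 - 1   # a Mersenne prime; big enough for a demo field
THRESHOLD = 30       # the match threshold Federighi cited

def make_shares(secret: int, threshold: int, count: int):
    """Split `secret` into `count` shares; any `threshold` of them recover it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]
    def poly(x: int) -> int:
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, poly(x)) for x in range(1, count + 1)]

def recover(shares) -> int:
    """Lagrange interpolation at x = 0; with fewer than `threshold` shares this yields garbage."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

if __name__ == "__main__":
    secret = 123456789                             # stands in for the key unlocking the derivative images
    shares = make_shares(secret, THRESHOLD, 100)   # one share per matching voucher
    print(recover(shares[:29]) == secret)          # False: below the threshold, nothing is learned
    print(recover(shares[:30]) == secret)          # True: 30 matches reveal the secret
```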

2 Likes

As far as auditing this system goes, MIT Technology Review points out that Apple is actually trying to prevent people from being able to do so:

Wow, Apple Legal and Apple PR clearly don’t work anywhere near each other on the Apple campus.

While it’s not inconceivable that Apple has a legitimate copyright concern over what Corellium is doing, the level of tone-deafness in pursuing this case at this particular point in time is astonishing.

1 Like

This lawsuit began long before the CSAM detection announcement. The article even says that the suit was filed in 2019.

Apple has never allowed anyone to run iOS on a non-Apple device, which is the sole purpose of Corellium’s product. The fact that Corellium is now issuing press releases designed to use the CSAM detection announcement in order to sway public opinion is just a sleazy way to get the public to take sides in a lawsuit that most people really don’t care about.

1 Like

Many people care about real security research being possible on iOS, and I’d even go so far as stating that you should probably care.

Edit: In case it’s not clear, Corellium’s product actually allows what Apple claims we can do. It doesn’t matter when Apple started suing them for allowing security research on iOS. It matters whether Apple is telling the truth about being able to audit the system. If Apple has been working since 2019 to prevent auditing iOS, then that only makes it worse. (In fact, Apple has been working against it for much longer than that.)

This is not a press release. This is an action Apple just took. They decided to appeal the decision that allowed security researchers to audit iOS and the CSAM system.

From the article:

“Apple is exaggerating a researcher’s ability to examine the system as a whole,” says David Thiel, chief technology officer at Stanford’s Internet Observatory.

Apple doesn’t let you run macOS on a PC. Being a security researcher doesn’t grant you an exception to the license, even though the court of public opinion might disagree.

This is no different. The CSAM discussion is a distraction and has nothing to do with the merits of the suit.

1 Like

In other words, it doesn’t matter if Apple is lying about the system being auditable?

Edit: And I never said it had anything to do with the merits of the suit. My contention is that Apple does not want or allow people to audit iOS, so contrary to their claims, the CSAM system is unlikely to be really auditable.

Of course Apple did; the initial judgement left an important door wide open:

“Although the fair use ruling will help Corellium breathe a sigh of relief, an open question remains as to whether Corellium’s fair use victory is a hollow one because the court did not dismiss Apple’s claim that Corellium circumvented Apple’s own security measures to protect iOS code in violation of the Digital Millennium Copyright Act (DMCA).”

The timing is right in this instance. My guess is they’ve been holding it in their pockets for the right moment once the negotiations to buy Corellium fell through.

The system hasn’t even been released. So how do you know that the only possible way to audit it will be to run it in Corellium’s emulator?

Maybe we should wait for other security researchers - those not in the middle of a 2+ year lawsuit with Apple - to get a chance to review the code before jumping to the conclusion that Apple is lying about the system being auditable.

1 Like

Umm… I quoted such a researcher. As did the article.

But the open door issue was not directly addressed in this discussion, and it is an important consideration.

The article quoted a researcher working for Corellium. Not exactly an objective source, since his company is the subject of ongoing litigation with Apple.

Again, let’s wait and see what is actually released. Apple said it would be auditable by third parties. This statement implies that they will be publishing a procedure. I’d like to see what it is before concluding that there really isn’t one.

If, after releasing this system, they fail to publish any procedure, or if the procedure is only available to people who have signed NDAs, then I will join you in your condemnation of Apple, but I’m going to wait until then before I come to such a decision.

1 Like

Are you alleging that “David Thiel, chief technology officer at Stanford’s Internet Observatory, the author of a book called iOS Application Security” is actually a researcher for Corellium?

I’m sorry if I’m not keeping up, but I don’t know what you’re referring to.

I couldn’t access the article; I had already hit the limit of 2 freebies for the site. Closed doors such as this are why I try to quote specifics in my posts.

Thanks, Adam. The books make great gifts for any occasion: family, friends, the person you just met on the street.

I think it’s the wrong question – it’s not why would China do it, it’s why wouldn’t China do it. Taking control that way ensures that they’re demonstrating their power, they’re intimidating the Chinese population, and they’re showing Apple who is in charge. It’s the default option to interfere.

But to answer it the way it was asked, a couple of reasons, I think, would motivate them:

  1. They’re worried that they’re missing something in their iCloud scanning – this will be a way to get a look from a different direction.
  2. In general, intelligence gathering is always about multiple sources of information and access. One source can be unreliable or incomplete – having lots of ways to access things is useful. Even the oddest and most convoluted way of doing things can be useful if it helps verify/dispute something else (in 1982, the Russians were worried that a large NATO military exercise was a cover for a preemptive nuclear strike on them. Among other ways of gathering information, they called blood banks in the United States to see if America was stockpiling blood in preparation for a mass casualty event. That’s convoluted).
  3. Apple has deeper knowledge of how to surveil its own phones than anyone else. Using a tool they created gives China that advantage. No one better to search a house than the homeowner.

(That photo of me may be a few years out of date.)

1 Like

Apple has now delayed this feature and promises to make improvements.

2 Likes

One wonders whether this delay will affect the planned release of Apple’s other announced backdoor, the one that would screen opted-in youngsters’ Messages accounts for inappropriate sexual material.

And almost as an afterthought is this nugget from 9to5Mac:
“It was also revealed through this process that Apple already scans iCloud Mail for CSAM”

1 Like