This is speculation. While I agree we haven’t seen Apple’s code and can only read the high-level white papers describing their system, from what I’ve read I don’t see a way for this type of information to be included and/or passed on to the server.
For example, Apple’s white paper shows that the way an image is reported as a positive hit is by successful decryption. In other words, the image’s match status (positive or negative against the database) is baked into the encryption key (so to speak), so when the server tries to decrypt it, only positive hits succeed. There’s no flag or tag on the file saying whether it is CSAM or something else. It’s just binary – decryption succeeds or it doesn’t.
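A toy sketch of that idea (this is not Apple’s actual PSI protocol, and all names here are illustrative): the client derives the encryption key from the image’s perceptual hash, and the server can only produce the same key, and thus decrypt, if that hash is in its database. A failed decryption reveals nothing.

```python
# Toy model: "match = successful decryption". Assumes a single-block
# keystream, so payloads are limited to 32 bytes. Illustrative only.
import hashlib
import hmac

def encrypt_voucher(image_hash: bytes, payload: bytes):
    """Client side: seal the payload under a key derived from the image hash."""
    key = hashlib.sha256(b"derive" + image_hash).digest()
    stream = hashlib.sha256(key + b"stream").digest()
    ct = bytes(p ^ s for p, s in zip(payload, stream))
    tag = hmac.new(key, ct, hashlib.sha256).digest()  # authenticates the key
    return ct, tag

def try_decrypt(db_hash: bytes, ct: bytes, tag: bytes):
    """Server side: re-derive the key from a database hash. Only a matching
    hash verifies the tag; a non-match yields no flag, no category, nothing."""
    key = hashlib.sha256(b"derive" + db_hash).digest()
    if not hmac.compare_digest(hmac.new(key, ct, hashlib.sha256).digest(), tag):
        return None  # decryption simply fails
    stream = hashlib.sha256(key + b"stream").digest()
    return bytes(c ^ s for c, s in zip(ct, stream))
```

The only information the server ever learns per image is success or failure; there is nowhere in a failed decryption to smuggle a “terrorist” or “activist” label.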
Because of the second layer of encryption, nothing more can be deduced about the image until the threshold quantity is exceeded. So until then, there is no real information. Even if additional information about the image type – “CSAM”, “terrorist”, “activist”, “jaywalker” – were included, it would have to live inside the voucher, within the second layer of encryption, which can only be opened once the threshold is exceeded.
That to me seems like a very poor system for finding terrorists or dissidents or others, as everything is still lumped together into one “exceeded the threshold” group of images.
Then you get into the issue of accuracy. Presumably Apple included the threshold system to prevent false positives and only flag egregious users (i.e. an account with a lot of CSAM). We don’t know what that threshold level is (Apple isn’t saying), but clearly it must be high enough for Apple to calculate their “1 in a trillion” odds of false positives.
Now if the system is modified to start watching for additional content (terrorists and jaywalkers), and those results are mixed into the group that exceeded the threshold, wouldn’t that interfere with Apple’s “1 in a trillion” calculation?
For example, say the threshold is 10: someone has to have at least 11 matching images to be reported. Apple has effectively decided that with only 9 matches, the false-positive odds would be worse than “1 in a trillion” and too risky (some of those matches might be false positives), so they set the threshold higher.
But if the threshold is exceeded with 3 terrorist hits, 3 activist hits, 2 jaywalker hits, and 3 CSAM hits, none of those categories has enough matches to support an accurate report (you need at least 11 of each to get the 1-in-a-trillion odds). Yes, a human could look at the images and decide, but you’ve basically ruined the whole threshold approach by mixing multiple types of content reporting into one system.
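The intuition that the threshold is doing the statistical work can be made concrete with a back-of-envelope binomial model (the per-image false-match rate here is an assumption, not one of Apple’s published numbers):

```python
# Probability that an innocent library of n_images photos racks up more
# than `threshold` false matches, under a simple binomial model with an
# assumed per-image false-match rate. Back-of-envelope only.
from math import comb

def prob_exceeds(n_images: int, p_false: float, threshold: int) -> float:
    """P(more than `threshold` false matches among n_images)."""
    return sum(comb(n_images, k) * p_false**k * (1 - p_false)**(n_images - k)
               for k in range(threshold + 1, n_images + 1))
```

Under this model, raising the threshold drives the odds of a wrongful flag down steeply. Split the pool across four content categories, though, and each category sits far below the count its own 1-in-a-trillion guarantee would require, which is exactly the problem described above.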
Now some countries wouldn’t care about the threshold, of course, and would be willing to crack down on an individual based on mere suspicion, but this seems like a strange system for Apple to build if the point were to let it be abused that way. From the description, it seems Apple only wants this to apply to the most egregious users, and as designed that would hold for whatever content was being searched for (CSAM or anything else).
If Apple just wanted to include a back door, there would be much easier ways of doing it. This system seems to me to be deliberately designed to be extremely limited – and any of the doomsday scenarios panicky people describe seem really unlikely. There are probably 100 other places we’re trusting Apple to do what they say that are more vulnerable than this supposed “backdoor.”