No, Apple's Machine Learning Engine can't surface your iPhone's secrets

Core ML is Apple's framework for machine learning. It lets developers easily integrate artificial intelligence models from a wide variety of formats and use them for things like computer vision, natural language processing, and pattern recognition. It does all this on-device, so your data doesn't have to be harvested and stored on someone else's cloud first. That's great for privacy and security, but it doesn't prevent sensationalism:

Wired, in an article I'd argue should never have made it into publication:

With this advance comes a lot of personal data crunching, though, and some security researchers worry that Core ML could cough up more information than you might expect—to apps that you'd rather not have it.

It's less likely some people worry and more likely they saw a new technology and figured they could stick it and Apple in a headline and get some attention — at the expense of consumers and readers.

"The key issue with using Core ML in an app from a privacy perspective is that it makes the App Store screening process even harder than for regular, non-ML apps," says Suman Jana, a security and privacy researcher at Columbia University, who studies machine learning framework analysis and vetting. "Most of the machine learning models are not human-interpretable, and are hard to test for different corner cases. For example, it's hard to tell during App Store screening whether a Core ML model can accidentally or willingly leak or steal sensitive data."

There's no data that an app can access through Core ML that it couldn't already access directly. From a privacy perspective, there's nothing harder in the screening process either. The app has to declare the entitlements it wants, Core ML or no Core ML.

This reads like complete FUD to me: Fear, uncertainty, and doubt designed to get attention and without any factual basis.
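To put the permission point in concrete terms, here's a minimal Swift sketch of the gate any photo-reading app has to pass, with or without a Core ML model on board. The function name is mine for illustration, and the app would also need an NSPhotoLibraryUsageDescription entry in its Info.plist:

```swift
import Photos

// Any app that wants to read the user's photos, Core ML or no Core ML,
// has to ask for the same permission at runtime. Core ML grants nothing extra.
func requestPhotoAccess(completion: @escaping (Bool) -> Void) {
    PHPhotoLibrary.requestAuthorization { status in
        // Only .authorized exposes the library's contents to the app.
        completion(status == .authorized)
    }
}
```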

The Core ML platform offers supervised learning algorithms, pre-trained to be able to identify, or "see," certain features in new data. Core ML algorithms prep by working through a ton of examples (usually millions of data points) to build up a framework. They then use this context to go through, say, your Photo Stream and actually "look at" the photos to find those that include dogs or surfboards or pictures of your driver's license you took three years ago for a job application. It can be almost anything.

It could be everything. Core ML could make it more efficient for an app to find very specific data patterns to extract but, at that point, an app could extract that data and all data anyway.
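For reference, the on-device classification being described boils down to something like this sketch, which runs a bundled model through Vision. PetClassifier is a hypothetical stand-in for whatever .mlmodel class Xcode would generate for a developer; the key detail is that the image has to be handed over by an app that could already read, and upload, that same image:

```swift
import CoreML
import Vision

// A hypothetical bundled model (PetClassifier.mlmodel) labels an image the app
// already has in hand. Core ML only sees what the app can already see.
func labels(for image: CGImage, completion: @escaping ([String]) -> Void) {
    guard let model = try? VNCoreMLModel(for: PetClassifier().model) else {
        completion([])
        return
    }
    let request = VNCoreMLRequest(model: model) { request, _ in
        let results = request.results as? [VNClassificationObservation] ?? []
        completion(results.map { $0.identifier })
    }
    try? VNImageRequestHandler(cgImage: image, options: [:]).perform([request])
}
```

Nothing in that flow touches data the app couldn't already reach on its own.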

Theoretically, finding and extracting a few photos might be easier to hide than simply pulling a large number of photos, or all of them. So could trickle uploading over time. Or pulling based on specific metadata. Or any other sorting vector.

Just as theoretically, ML and neural networks could be used to detect and combat these kinds of attacks as well.

For an example of where that could go wrong, think of a photo filter or editing app that you might grant access to your albums. With that access secured, an app with bad intentions could provide its stated service, while also using Core ML to ascertain what products appear in your photos, or what activities you seem to enjoy, and then go on to use that information for targeted advertising.

Also nothing unique to Core ML. Smart spyware would try to convince you to give it all your photos right up front. That way it wouldn't be limited to preconceived models or be at risk of removal or restriction. It would simply harvest all your data and then run whatever server-side ML it wanted to, whenever it wanted to.

That's the way Google, Facebook, Instagram, and similar photo services that run targeted ads against that data already work.

Attackers with permission to access a user's photos could have found a way to sort through them before, but machine learning tools like Core ML—or Google's similar TensorFlow Mobile—could make it quick and easy to surface sensitive data instead of requiring laborious human sorting.

I get that putting Apple in a headline garners more attention, but including Google's TensorFlow Mobile only once, and only as an aside, is curious.

"I suppose CoreML could be abused, but as it stands apps can already get full photo access," says Will Strafach, an iOS security researcher and the president of Sudo Security Group. "So if they wanted to grab and upload your full photo library, that is already possible if permission is granted."

Will is smart. It's great that Wired went to him for a quote and that it was included. It's disappointing that Will's quote was included so far down and unfortunate for all involved that it didn't get Wired to reconsider the piece entirely.
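Will's point is easy to make concrete. Once photo permission is granted, nothing scopes an app to "a few" images; a single PhotoKit fetch hands back the entire library, roughly as in this sketch (function name is mine for illustration):

```swift
import Photos

// With photo permission already granted, one fetch returns every image asset
// in the library. An app could then read, or upload, any or all of them,
// with or without a Core ML model in the loop.
func allPhotoAssets() -> [PHAsset] {
    let options = PHFetchOptions()
    options.sortDescriptors = [NSSortDescriptor(key: "creationDate", ascending: true)]
    let result = PHAsset.fetchAssets(with: .image, options: options)
    var assets: [PHAsset] = []
    result.enumerateObjects { asset, _, _ in
        assets.append(asset)
    }
    return assets
}
```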

The bottom line here is that, while machine learning could theoretically be used to target specific data, it could only be used in situations where all data is already vulnerable.

Beyond that, Core ML is an enabling technology that can help make computing better and more accessible for everyone, including and especially those who need it the most.

Sensationalizing Core ML, and machine learning in general, makes people who are already fearful or worried about new technologies even less likely to use and benefit from them. And that's a real shame.

Rene Ritchie
Contributor

Rene Ritchie is one of the most respected Apple analysts in the business, reaching a combined audience of over 40 million readers a month. His YouTube channel, Vector, has over 90 thousand subscribers and 14 million views and his podcasts, including Debug, have been downloaded over 20 million times. He also regularly co-hosts MacBreak Weekly for the TWiT network and co-hosted CES Live! and Talk Mobile. Based in Montreal, Rene is a former director of product marketing, web developer, and graphic designer. He's authored several books and appeared on numerous television and radio segments to discuss Apple and the technology industry. When not working, he likes to cook, grapple, and spend time with his friends and family.

9 Comments
  • It's iPhone X time, everything goes!!! Wired showing they are Google's puppet! Waiting for VAVA defense...
  • I made you a tinfoil hat
  • So when one grants photo library access to an app—say, to allow it to edit one particular photo—are you saying that this automatically gives the app full access to the entire library, and that the app could upload either the whole library or its associated metadata without explicit permission from the user? I know this has nothing to do with Core ML, but why would Apple create a framework that allows such a thing to happen?
  • When one grants photo library access to an app, that _is_ giving explicit permission to your photos. So the app can do anything they want with them, including uploading them to a server. There is no mechanism for granting permission to one photo at a time. If you aren't comfortable with that, you may be able to copy a photo to your iCloud drive, and open it that way, but it would depend on the app in question.
  • > There is no mechanism for granting permission to one photo at a time. There is a mechanism for that, and probably Apple will force the adoption of that mechanism if the need arises.
  • Because having the phone ask for access on every photo is where we put security in front of usability. I think the current system is perfectly fine, I don't see what's wrong with it. Any app can abuse any kind of permission; it's like saying we should only use plastic kitchen knives so no one stabs anyone.
  • Here is a thought: most (all?) apps that require access to photos should get that access exclusively through a procedure call that launches the usual file-open process, which requires MANUAL manipulation of the iPhone interface to select a picture. Why would such an app require -- or expect or desire -- any more access to photos, such as the ability to grab and upload all of them??? The current approach to security is designed to provide a mere sense of security... the proverbial false sense. It is a pervasive misunderstanding that partially explains the shortcomings in the Wired story. The Wired author is onto a story... he/she just hasn't quite figured out the angle yet.
  • I see what you mean, I think there could be a separate permission for allowing an app to "upload media". So you'd allow the app as normal to access photos, and upon uploading an image, you'd get "Do you wish to give permission to upload photos?". This permission would get triggered at any point media is requested to be sent to any kind of server. There's no problem with the app doing whatever it wants to images so long as it's contained within in the app, but once it starts sending that image to a server, that's where the concerns are raised.
  • Sadly, the author turned a PSA on photo permissions into a FUD on [blank] machine learning. [blank] = CoreML What we really need is to be able to see/control what websites apps connect to. (Think something similar to cellular controls).