Back before iPhone X was released I had a chance to talk with Apple about it, and one of the questions I asked was how Animoji — the incredibly cute animated emoji system built into iMessage — worked. The answer I got was that the TrueDepth camera system captures a crude depth mask with the IR system and then, in part using the Neural Engine on the A11 Bionic processor, persistently tracks and matches facial movement and expressions with the RGB camera.

I didn't think much else about it at the time. That answer fit exactly with the public documentation available to date on ARKit, the augmented reality framework Apple provides to developers (which, with iPhone X, also handles facial tracking and expression matching). It also fit my observations of the lighting requirements and tracking speed I experienced in the demo area.

Since then, some confusion has cropped up about whether Animoji really requires iPhone X-specific hardware. It does, but it's easy to see how some people have come to think otherwise: you can cover the IR system and Animoji keeps working, but cover the RGB camera and it stops.

The reason for the misconception comes from the implementation: the IR system currently fires only periodically, to create and update the depth mask. The RGB camera has to capture persistently to track movements and match expressions. In other words, cover the IR system and the depth mask simply stops updating and likely, over time, degrades. Cover the RGB camera, and the tracking and matching stop dead.
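Developers can see the output of this pipeline through ARKit's public face-tracking API. A minimal sketch (assuming an iOS app target running on a TrueDepth-equipped device; the `FaceTracker` class name is my own) that reads the per-expression blend-shape coefficients ARKit derives from the combined depth and RGB data:

```swift
import ARKit

// Sketch only: ARFaceTrackingConfiguration requires a device
// with a TrueDepth camera — isSupported is false everywhere else.
class FaceTracker: NSObject, ARSessionDelegate {
    let session = ARSession()

    func start() {
        guard ARFaceTrackingConfiguration.isSupported else { return }
        session.delegate = self
        session.run(ARFaceTrackingConfiguration())
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let faceAnchor as ARFaceAnchor in anchors {
            // blendShapes maps expression locations (.jawOpen, .eyeBlinkLeft, etc.)
            // to 0.0–1.0 coefficients — the kind of data an Animoji-style rig consumes.
            if let jawOpen = faceAnchor.blendShapes[.jawOpen] as? Float {
                print("jawOpen:", jawOpen)
            }
        }
    }
}
```

Note that the API hands back already-fused tracking data; apps never choose between the IR and RGB streams themselves, which is why covering one sensor degrades results in the way described above.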

Snapchat might make it easier to picture: Snapchat has had popular face-tracking filters for a long time. With iPhone X and the IR system, it can track faces and match expressions much, much better. Cover TrueDepth, though, and you get the same merely OK tracking and matching you always had.

Almost a decade ago, iPhone 3GS shipped with video recording. That feature wasn't back-ported to iPhone 3G, which upset some people. Eventually it was MacGyvered on, but it could only capture at 15 fps. Some people may not have cared; others would have cared very much. Apple set 30 fps as the target and wouldn't settle for anything less.

Apple could, others have, and certainly many more will create Animoji-like experiences for older iPhones, but absent the TrueDepth camera system they wouldn't benefit from the more precise face tracking and expression matching ARKit has to offer. And that's not just the bar Apple set for Animoji, the bar everyone will judge the team on; it's also the system the company built Animoji to show off in the first place.

In other words, Apple could have made a sloppy version for iPhone 8 (which shares the A11 Bionic but lacks TrueDepth) or a crappy version for iPhone 7 (which lacks both), but the company would likely be blasted for the poor performance by the same people now blasting it for not having the feature at all.

And if Apple later updates Animoji in a way that makes it even more dependent on the depth map, those updates would simply not be possible on older hardware. And that would prove even more annoying.

It's better to think of Animoji less as a fun iMessage feature and more as an engaging tech demo that shows developers and customers what the TrueDepth camera, A11 Bionic, and ARKit can really do.

And, in that sense, Animoji is just a beginning, not an ending.
