What will Apple's next-generation Siri voice assistant offer come WWDC 2017?
Siri… is complicated. Siri was originally available as an app on the App Store; Apple bought it, re-imagined the service, and launched it as one of the tentpole new features for iPhone 4s. Over time, Siri expanded to iPad and iPod touch, Apple Watch and Apple TV and, most recently, the Mac.
Apple has also added new features, from navigation to dinner reservations, Siri suggestions, HomeKit control, and just last year, an interface for developers. Since Siri lives on servers, Apple's also been quietly updating it, adding everything from entirely new engines to holiday or event-themed jokes.
There aren't many rumors to round up, so I'll list some long-standing wishes as well.
What do past Siri updates tell us about future Siri updates?
If you believe previous behavior is the best indicator of expected behavior, here's what we've gotten so far:
- 2011: Siri launches on iPhone 4s as part of iOS 5
- 2012: Siri launches on iPad and iPod touch as part of iOS 6
- 2015: Siri gets "Hey, Siri!" voice activation on iPhone, Siri launches on Apple TV as part of tvOS 9, Siri Suggestions begin to propagate as part of iOS 9
- 2016: Siri launches on Mac as part of macOS 10.12, SiriKit API made available to developers as part of iOS 10
So, really, it tells us nothing. Apple finally has Siri on all its platforms, but it also has staggering new competition from Amazon in the form of the licensed Alexa, Google with its new Assistant, and even Samsung with its stumbling but audacious Bixby and upcoming Viv integration — the new system from the creators of Siri.
Increased contextual awareness
The only major Siri rumor coming into WWDC 2017 is an improved contextual awareness engine. Siri can already do sequential inference — if you ask for the capital of Germany, Siri will tell you Berlin, and if you then ask for the population, Siri understands you're still asking about Berlin and gives you the correct answer. But Siri's contextual awareness is both limited to the current conversation and fragile enough that it often breaks or fails.
According to The Verifier, that may change.
According to information received directly from the development teams based in Israel and the US, Siri will be upgraded at the level of its artificial intelligence According to the information, Siri will continue to fulfill the voice commands known today, but like Bixby recently introduced by Samsung , You will be able to learn the user's usage habits and offer different action options depending on the context of the content. [sic]
Siri already has limited knowledge of what you're doing. For example, if you say "Siri, remember this", Siri will set a Reminder bookmark for your current web page, position in a podcast, iMessage conversation, etc.
With Suggestions, Siri also began using location, time, and behavioral patterns to predict which apps you may want to access at any given moment. For example, it can pick up on the fact that you check Twitter first thing in the morning or text your significant other as soon as you leave work.
What this suggests is that Siri's contextual awareness will start to parse not just an ongoing conversation but ongoing and perhaps even predicted activity as well. In other words, the separate will become one.
More of 'this'
Siri's ability to use activity markers, originally introduced as part of Apple's Continuity feature for iOS and macOS, lets it set reminders for web pages, podcast positions, iMessage conversations, and more. It's triggered by saying "Remember this".
It works so well and is so compelling, I'd love to see it used far more often and for far more things:
- "Siri, read this" should trigger the screen reader for whatever text is currently being displayed, including iBooks, web pages, messages, and more.
- "Siri, send this" should trigger the share sheet and offer to send the current content via iMessage, Mail, and any app with a compatible share extension.
- "Siri, what is this?" should trigger the help system in the OS or app to explain what you're interacting with and how it can best be used.
- "Siri, print this" should pull up a PDF version of whatever is currently on screen and either let you save it to iCloud or a document provider or AirPrint it.
Once Siri truly understands "this" to mean current activity, the productivity possibilities are endless.
Type to Siri
I've been asking for, and filing radars for, a text-based Siri since it launched. Back then, it was simply a convenience. In some situations it's impolite or impossible to talk to your device. In those cases, being able to quickly type a query or command is extremely powerful.
The kicker is that Siri has been able to take text-based input for years. You simply had to speak first, edit, and resubmit second.
Since then, though, we've seen the rise of bots. (It's like the rise of the machines in The Matrix or Terminator, only less deadly and more annoying.) With bots, messaging services have essentially become text-based Siri.
Google is doing it. Facebook is doing it. Most everyone is doing it except for the company that had both iMessage and Siri before any of its competitors — Apple.
It might seem trivial, but not having to switch contexts when you're engaged in an activity is profoundly empowering. iMessage apps were a start, but they're dumb. Siri is smart but not integrated into iMessage, Notes, or anywhere else.
If Apple can figure out a way to unite those capabilities while still protecting privacy and maintaining encryption (by keeping it local, for example), it'd be a huge win for everyone.
Well, except maybe Amazon and Google.
Putting the assist in assistant
One of the theoretical features in Samsung's upcoming Bixby system is a true voice interface. Not voice command or control, where you tell the assistant to do a task and it does it, but the ability to say which buttons you want pressed and what text you want entered.
It may seem like voice command and control supersedes simple voice interaction, but it doesn't. At least not yet. The former can still only do a small subset of everything that's possible with the operating system and apps. If you unlock the full interface for voice activation, however, you can do anything.
The accessibility implications alone are so staggering I very much hope Apple has this on its radar. And in its radar feature request queue.
"Hey, Siri" has gotten better. As of iPhone 6s, it tries to learn your voice to prevent accidental or mischievous activations by others. Yet as Siri spread from phone to tablet to watch to TV, there's little that has been done to prevent accidental activations by you yourself.
Offering alternative or custom activation phrases could help with that. Especially in households with multiple people owning multiple devices, it could, for example, let anyone activate the iPad in the living room rather than the wrist on your stretching arm or phone you left on the table.
This is another one of those long-time wishlist items it would be great to see Apple address.
It's totally not fair that Amazon Alexa users can say "Computer!" in a terrible brogue and get their every Star Trek fantasy made manifest and Apple users can't.
It took Apple until 2016, five long years after launch, to provide the beginnings of a Siri application programming interface for developers. Instead of a bunch of limited, rigid word recipes, though, Apple's ambition is to provide fully fleshed out domains. We've only gotten a few so far:
- Ride booking
- Photo and video
- Payment apps
- VoIP calling
It's not hard to imagine (and long for!) more. Music, for example, to give voice control to Spotify. Podcasts, so Overcast, Pocket Casts, and the like can work in the same way as Apple's Podcasts.
The end game would be Apple's newly acquired Workflow, though. Who needs IFTTT if you can build your own Siri-enabled automations right inside iOS?
Siri has a single name but multiple "personalities". Siri on iPhone can do different things than Siri on Apple TV, which can do different things than Siri on Mac.
Sometimes that makes sense. None of the Mac-specific file handling features are needed or wanted on Apple TV, for example. Sometimes, though, it makes no sense at all. Why can I use Siri to control HomeKit from my iPhone, iPad, Apple Watch, and Apple TV but not my Mac?
Consistency is a user-facing feature. If you can't rely on something to be there, you'll soon stop trying. No customer should have to even think about which Siri service is available on which device. We should simply speak and it should simply work. That's the job.
Celebrity voice packs!
D'oh! As much as James Earl Jones and Scarlett Johansson are likely at the top of everyone's Siri voice wish-list, you need a lot more meat before you start on the gravy.
What do you want to see next from Siri?
Some parts of Siri are still beyond frustrating. "Turn on the lights." "Sorry, I can't do that." "Turn on the lights." "OK, the lights are on." That should never, not ever happen. Other parts are so cool you feel like you're suddenly in a movie set in the far-flung future. "Siri, Game of Thrones". "OK, the shades are down, the lights are red, and the theater plug is on."
Thanks to some splashy presentations by Google, Amazon, and Facebook, where artificial intelligence, machine learning, and computer vision got thrown around like the new mobile, local, social, the perception is Apple is behind.
Rumors about what's coming next with Siri have been few and far between but our wish lists keep growing.
WWDC 2017 is Apple's next big opportunity to make a statement. Hopefully that statement starts with a much better Siri.