If Apple wants Siri to be crowned leader of the voice-controlled assistant pack, the company has a few improvements to make.
There's been a lot of talk about voice-controlled assistants lately, spurred on by the apparent success of the Amazon Echo and Google's promotion of its own assistant at Google I/O in May. And when there's any tech topic under discussion, the conversation seems to inevitably turn to Apple. Apple brought voice assistants into the public consciousness with the launch of Siri nearly five years ago, but there seems to be a general sense of unease about the current state of Siri.
It's natural, really. The weeks between Google I/O and Apple's own developer conference are traditionally full of analysis of all the ways Apple is trailing behind Google, and much of it will be nullified or countered by the time Apple wraps up its keynote event. But the pace of Siri improvements has seemed a little slow the past few years, and both Google's tech demos and Amazon's clever Echo have definitely whetted our collective appetite for what comes next for Apple's remarkably high-profile voice assistant.
The challenge for a technology like Siri is that we all know what the end point is: It's an in-ear assistant that knows everything and is indistinguishable from a real person, like the ones in the movie "Her." The challenge for Apple, Google, and Amazon is that we're a long way off from that. How do we get from here to there? Here's my own personal wish list.
Up in the air
If my Amazon Echo has taught me anything, it's the value of having an intelligent assistant "in the air" in my house. Apple has improved Siri's reach through the "hey Siri" trigger word and the Apple TV's Siri remote, but neither offers the hands-free, vision-free, easily accessible interface that the Echo provides.
Apple's got the skill to build Echo-like hardware (and even has speaker know-how courtesy of its Beats acquisition), and Siri's global reach — it's available in many countries and languages, as opposed to the U.S.-only Echo — would give a "Siri speaker" product a leg up on the Echo, and would be a strong competitor to Google's just-announced Google Home product.
Open the gates
There are also reports that Apple plans to open up Siri to third-party apps, and that's great news. The devil is in the details, of course, but one of Siri's great weaknesses has been the fact that all functionality is built in by Apple, meaning that if Apple doesn't think it's important, Siri can't do it.
Siri needs to connect not just to apps running on iOS devices, but to web services. The Echo's connection with the simple, flexible automation service IFTTT opens the device up to a huge number of different integrations, far more than Siri can offer. Via the Echo's IFTTT gateway, I can turn on lights (ones that aren't compatible with Apple's HomeKit!) and control my living room TV with my voice, all via actions that I've defined myself.
Apple will never be able to anticipate all the ways Siri can be used. That's why it needs a release valve, a gateway to the rest of the world that will allow apps and other Internet data sources to be tasked with providing information through Siri. Third-party app support, if done right, could solve this problem and make Siri's potential limitless.
End "Have a look"
I also think Siri needs to get better in situations where you can't look at an iPhone screen. Too often, Siri simply gives up and shows a fragment of a search result on the iPhone screen. "Have a look" is the ultimate Siri cop-out. That can't happen if you're talking to a screenless device like an Apple version of the Amazon Echo, and it's also terrible if you're driving a car.
It strikes me that Siri could be a much better assistant to people who are driving their cars. CarPlay is a nice idea, but far more people have Bluetooth audio in their cars than have CarPlay. It's unsafe to look at an iPhone screen while you're driving, but Siri can theoretically replace that need. It handles some basic tasks now — reading recent text messages when you ask, for example — but too much of what it offers requires you to look at the screen. (Siri should also be much more wordy when it knows I'm driving, including offering to read the text of notifications I receive and providing a more interactive interface for processing lists of information.)
When it comes to Siri's capabilities, it's way too easy to hit the wall. If I'm stuck in a car for a couple of hours, I'd like Siri to be able to play the latest news, shuffle the contents of a playlist, and play a podcast from a third-party app. But I should also be able to triage email, check on the conversations in my Slack groups, make a restaurant reservation, check my Twitter replies, and a whole lot more. Siri should be able to keep me connected to my personal data sources when I can't look at my phone screen, and right now, for the most part, it can't.
Siri also just needs to get better at making guesses. When I asked it if I had any new email from Erika, it searched for (and failed to find) emails from "Erica." Some fuzzier matching — or a request for more detail — would be helpful. But too many of my Siri interactions just end in confusion.
What Siri does well today are tasks that are more efficiently commanded via voice than via a finger on a screen. I use Siri primarily to set timers and convert measurements, because Siri's way easier at that than the iPhone UI. The more Siri can do better than I can with my fingers and my iPhone screen, the more I'll use it. This means that, in addition to supporting better data sources, it needs to keep getting smarter about guessing what I'm trying to tell it.
The true path to intelligence
In the medium term — before Siri becomes a sentient being — Apple's assistant and all of its kin need to get a lot better at holding conversations, collecting information, performing tasks, and reporting back. My dream is being able to tell my digital assistant to ask my wife if she wants me to order dinner, and have the assistant do the rest — texting her, waiting for an answer, and then relaying the answer back to me. Right now, using these voice-driven interfaces is a lot like using a command-line interface back in the day — you have to say a sequence of words in just the right order, and if it doesn't work, you need to start over. We need to be able to reason with these assistants, to explain ourselves. Using a digital assistant needs to become more like a conversation and less like a sequence of commands, because the promise of this technology is that there should be no learning curve.
There should be no such thing as a Siri power user. That's something for Apple to shoot for — but in the meantime, third-party data sources and a reduced reliance on using the iPhone screen will get things going in the right direction.