Siri OS: Could natural language be Apple's next big leap forward?

Another idea we've been talking around a lot, both in articles and on the iMore show and the Iterate podcast, is the future of Apple's virtual personal assistant, Siri, and what it means for current graphical user interfaces. While we were working on the iOS 6 Siri previews, the idea began to coalesce. During the WWDC 2012 keynote, Apple senior vice president of iOS Scott Forstall showed off the updates planned for Siri in iOS 6 and used an interesting turn of phrase -- "you can even tap to watch the trailer right here in Siri".

Not "using Siri". Not "with Siri". "In Siri."

Siri has never been part of the original iOS Home screen system, SpringBoard. It's always been a layer unto itself. Siri even includes a robust system of widgets, including alarms and timers, messaging and email, Facebook and Twitter, maps and locations, and info sheets including restaurants, sports, movies, and more. Unlike on Android, you can't pin any of them persistently to the Home screen. They exist only in Siri.

It's a parallel interface layer that uses natural language -- voice -- instead of multitouch -- gestures. Right now it's an extremely limited, not always reliable one, but it is one. With iOS 6, it can even launch apps, just like the Home screen's SpringBoard.

It's not hard to imagine Siri continuing to improve and expand until it can do pretty much everything the Home screen's SpringBoard system can do, only with voice rather than multitouch gestures.

All major revolutions in computing have resulted from the mainstreaming of a new and more accessible interface paradigm. The Apple II popularized command-line interfaces (CLI). The Mac popularized graphical user interfaces (GUI). The iPhone popularized multitouch user interfaces. Even the iPod's success could arguably be tied, at least in part, to the advent of the click wheel as interface.

Could Siri popularize natural language user interfaces? Could the next big shift, and the next huge adoption curve in the mainstream market, come when Siri is ready for prime time? Could natural language interfaces do to multitouch and GUI what multitouch and GUI did to the command line?

As much as Minority Report teased us with multitouch before Apple put it into hundreds of millions of hands, natural language has been teased even longer. Star Trek had "Computer". 2001 had "HAL". Knight Rider had "KITT". And on and on. Collectively, we've had hundreds if not thousands of science fiction stories promising a future filled with machines that we could not only talk to, but that we could talk with. It's the only step left before Mitchell Gant, Firefox, and having to think in Russian...

Google's reaching for this future as well. Google Now tries to do what Siri does and even more. Siri parses queries and tries to understand context. Google Now tries to predict context before you even query it. Palm talked about this for webOS years ago -- your phone knows where you are, what time your appointment is, and what traffic is like, so why should you have to carry that cognitive load? Why can't your phone realize you'll be late, alert you, provide alternate directions, and email your contact to let them know you're running late? Palm never delivered on that dream, but Google Now is starting to.

Again, it's not hard to imagine Apple implementing similar features in Siri, since the iPhone and Apple's new Maps system, among other things, can provide similar information.

And, as Apple tried to prove at the wrong time, in the wrong way, with the wrong device -- the buttonless iPod shuffle -- when natural language is the interface, the size of the screen, even the existence of a screen, stops mattering. Computers can become tiny, wearable, embeddable, invisible.

The migration from CLI to GUI to multitouch has all been driven by the urge to democratize computing. (The more people who can use computers, the more people you can sell computers to.)

When natural language becomes easier to use than multitouch, and mainstream users start using it more, we just might see the next great transition in interface, and the next great expansion in computer user base.

And Apple will have spent years positioning Siri to be there.

Apple II. Mac. iPhone. Siri.

Rene Ritchie

EiC of iMore, EP of Mobile Nations, Apple analyst, co-host of Debug, Iterate, Vector, Review, and MacBreak Weekly podcasts. Cook, grappler, photon wrangler. Follow him on Twitter and Google+.


Reader comments


At this point I don't want my phone to attempt to predict what I want. My life is varied enough and I'm doubtful that my phone is going to provide the information I need with more frequency than it provides me info I don't need.

At this point we should be looking at more complex queries. Neither system handles the type of query that a 5-year-old can easily manage: "Go grab the remote from the kitchen table, take it to your mother and then bring it back when she's done."

At this point both Siri and Google Now spit back 1 answer per query, but what if my question involves multiple answers? "What is the population of the countries that surround Germany?" "When was World War II and who were the protagonists?"

That's a whole level of complexity above what we have now. Taking the easy way out would encompass my phone doing things I really don't need, i.e. telling me when the train is coming as I enter the subway. Whether the train is late, early or derailed is beyond my sphere of control.

Siri is a huge deal and it's no surprise that developers don't have access to it yet. Apple needs to get the semantics working right across multiple languages and then further improve the parsing process so that it can handle nested and conditional queries with a high degree of accuracy. If not... people will not use it that much.

Bit OT; but think the word you were looking for was "belligerents". WW2 was a horrible global war, not theatre. No wonder Siri was confused.

When (and if) Siri can not just launch but interact with apps, I will start to believe that.

Google Now is a great idea, but I'm not sure if I want it. The privacy issue bothers me… But I think the future is going that way.

I have it. On my rooted gs2 skyrocket. I like it so far. Privacy isn't too big for me because the goo and AT&T already know where I am. Especially when I use places to find local things.

An image can convey a considerable amount of data, so the phone screen is not going away anytime soon. Even the Enterprise in Star Trek had screens. That said, I agree that Siri is a very big deal.

I'll believe Siri is becoming something more when I don't have an apoplectic fit just trying to get her to call my wife while I'm driving home. That's the only thing I miss about WP7: rock-solid voice dialing. It's never wrong. I end up cursing Siri out every day.

The new Siri has some pretty great features. However, 95% of the time, I don't want to talk to my phone. Maybe I am in public, in a quiet place, or I am at work and don't want to ask Siri what the Red Sox score is out loud.

Siri is a great search feature and I would love to be able to type inquiries in as well as speak them.

This is the exact problem I see in today's age. More and more people are more interested in text messaging than in calling or talking on a phone. If people don't even want to communicate by voice THROUGH a phone, why would they want to communicate by voice WITH a phone? In most instances vocal communication is faster, more effective and provides more context, but still a growing population of smartphone users are choosing to communicate with and through their phones with their fingers instead of their voice.

Re: "The more people who can use computers, the more people you can sell computers to."

This is a concept that some geeks resist tooth and nail. That tiny minority still thinks that technology should be hard to use, and doesn't want it to change, because they spent so much time learning the old way, and because showing off their obsolescent skills makes them feel smart and / or superior.

I don't know how many times I heard "This whole GUI thing is a waste of CPU cycles. I prefer typing commands in the terminal because it's faster." (And I'm not going to say how long ago I heard all that.)

Technology is relentlessly propelled toward the mass market by economics. If a technology doesn't reach consumers, or isn't adopted widely by corporate IT, it could either die or remain way out at the edges of the geek fringe. I think Siri is on its way to becoming a serious alternative to typing and multi-touch.

Having said all that, I think it would be inappropriate to rely 100% on a voice interface. Especially in libraries and other quiet areas and situations that require silence.

"Having said all that, I think it would be inappropriate to rely 100% on a voice interface. Especially in libraries and other quiet areas and situations that require silence."

Exactly! It could only ever work in tandem with a multitouch/GUI interface. It's not realistic to talk to your electronics in public. The idea is absurd. Unless, of course, there isn't a "public" in the future. Maybe everybody will stay inside and never interact with anybody else.

Having said that, I like where Siri is going. I use it now and will continue to use it as long as it keeps improving.