Apple's AI plans have been boosted by new research that could allow ChatGPT-like large language models to run on iPhones


Apple has long been rumored to be planning a big AI push in 2024 and beyond, and new research could go a long way toward making that a reality while still meeting Apple's demands for security and privacy.

To date, large language models (LLMs) like those on which ChatGPT is based have been powered by computers housed in data centers and accessed via a webpage or an iPhone app. They're huge pieces of software that require equally huge amounts of resources to work properly, which makes it difficult to run them locally on phones like the upcoming iPhone 16. But running LLMs in data centers raises privacy concerns, and with Apple already working to keep as many Siri requests on-device as possible, it's no surprise that Apple might want to do the same with any LLM implementation it is working on.

Now, a research paper might have the answer and it could open the door to Apple's in-house Apple GPT making a debut outside of Apple Park. But if Siri really is going to get a big upgrade, could the 2024 iPhones come too soon?

On-device processing

The research paper, titled "LLM in a flash: Efficient Large Language Model Inference with Limited Memory," is authored by multiple Apple engineers and discusses how an LLM could run on devices with limited RAM (or DRAM), like iPhones. The same approach could also help bring Siri upgrades to other RAM-constrained devices like low-end MacBooks and the iPad, not to mention the Apple Watch.

"Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks," the paper begins. "However, their intensive computational and memory requirements present challenges, especially for devices with limited DRAM capacity. This paper tackles the challenge of efficiently running LLMs that exceed the available DRAM capacity by storing the model parameters on flash memory but bringing them on demand to DRAM."

Flash storage, the storage capacity you choose when buying your iPhone, is far more plentiful than DRAM, and part of it can be set aside for the LLM's data. The paper discusses ways of using a device's flash storage to supplement DRAM and describes two main techniques: "windowing" and "row-column bundling."
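To make the idea of pulling parameters from flash "on demand" more concrete, here is a minimal Python sketch. It is not Apple's implementation: the file layout, the row width, and the helper names (`write_weights`, `load_row`, `demo`) are all assumptions made for illustration, and an ordinary temporary file stands in for flash storage.

```python
import mmap
import os
import struct
import tempfile

ROW_FLOATS = 4               # hypothetical number of weights per row
ROW_BYTES = ROW_FLOATS * 4   # each float32 weight takes 4 bytes


def write_weights(path, rows):
    """Write rows of float32 weights to a file standing in for flash storage."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(f"<{ROW_FLOATS}f", *row))


def load_row(path, index):
    """Read one row on demand instead of loading the whole file into memory."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = index * ROW_BYTES
            return struct.unpack(f"<{ROW_FLOATS}f", mm[start:start + ROW_BYTES])


def demo():
    """Store 8 toy rows 'on flash', then fetch only row 5."""
    rows = [[0.5 * (r + c) for c in range(ROW_FLOATS)] for r in range(8)]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_weights(path, rows)
        return load_row(path, 5)
    finally:
        os.remove(path)
```

Calling `demo()` returns only the requested row, while the rest of the file stays on disk. A real system would work with far larger tensors and, as the paper's technique names suggest, would care a great deal about how much data each read pulls in.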

The paper explains that "these methods collectively enable running models up to twice the size of the available DRAM, with a 4-5x and 20-25x increase in inference speed compared to naive loading approaches in CPU and GPU, respectively."
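As a rough illustration of why "windowing" can cut down on flash reads, the toy Python sketch below keeps in memory only the neurons used by the last few tokens and loads just the ones that are missing. The class name `NeuronWindowCache`, the window size, and the neuron IDs are hypothetical, and the paper's actual method is more involved than this.

```python
from collections import deque


class NeuronWindowCache:
    """Toy model of windowing: DRAM holds neurons used by the last `window` tokens."""

    def __init__(self, window):
        self.window = window
        self.recent = deque()   # active-neuron sets for the most recent tokens
        self.in_dram = set()    # neuron ids currently held in DRAM
        self.flash_reads = 0    # total neurons loaded from flash so far

    def step(self, active):
        """Process one token; return the neurons that had to be read from flash."""
        active = set(active)
        to_load = active - self.in_dram      # only missing neurons touch flash
        self.flash_reads += len(to_load)
        self.recent.append(active)
        if len(self.recent) > self.window:
            self.recent.popleft()            # oldest token leaves the window
        # Keep the union over the window; everything else is evicted.
        self.in_dram = set().union(*self.recent)
        return to_load


# Made-up active-neuron sets for four consecutive tokens.
tokens = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {1, 5}]

cache = NeuronWindowCache(window=2)
loads = [cache.step(t) for t in tokens]

# Reloading every active neuron for every token, with no reuse.
naive_reads = sum(len(t) for t in tokens)

print(cache.flash_reads, "reads with windowing vs", naive_reads, "without")
```

With these made-up inputs the windowed cache reads 6 neurons from flash, against 11 when every token reloads everything it needs. The saving comes entirely from consecutive tokens sharing active neurons, which is the property the sketch assumes.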

Obvious benefits

The benefits of such an approach are obvious. Storing an LLM on an iPhone would not only remove the need to host it in a remote data center and improve privacy, it would also be much faster. Removing the latency created by poor data connections is one thing, but the speed increase goes beyond that and could make Siri respond more accurately and more quickly than ever before.

Apple is already rumored to be working on bringing improved microphones to the iPhone 16 lineup, likely in an attempt to ensure Siri hears what people ask of it more clearly. Couple that with the potential for an LLM breakthrough and the 2024 iPhones could have some serious AI chops.


Oliver Haslam


  • aergern
    If it's just going to be for the 16 and above, doesn't really matter where it runs as it's not going to be a major subset of devices. We'll see but I don't have much hope that it's not going to be an iPhone 16 exclusive. /shrug