Apple’s latest research into running large language models on smartphones offers the clearest indication yet that the iPhone maker plans to compete with its Silicon Valley rivals in artificial intelligence.
Its researchers write that the paper, titled “LLM in a Flash,” offers “a solution to a current computational bottleneck.”
Their approach “paves the way for efficient inference of LLMs on devices with limited memory,” they wrote. Inference refers to how large language models, the AI systems that power apps such as ChatGPT, answer user queries. Chatbots and LLMs typically run in vast data centers with far more computing power than an iPhone.
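The paper’s title alludes to its core idea: keeping model weights in flash storage, which phones have in abundance, and pulling only the slices needed at each step into scarce DRAM. The sketch below is a rough, hypothetical illustration of that idea, not Apple’s implementation: it memory-maps a toy feed-forward layer from disk and reads in only the rows a stand-in sparsity predictor marks as active. The file name, layer sizes, and predictor are all illustrative assumptions.

```python
# Minimal sketch of flash offloading, under illustrative assumptions:
# weights live on "flash" (here, a memory-mapped file on disk) and only
# the rows needed for the current token are loaded into RAM.
import numpy as np

HIDDEN, FFN = 64, 256  # toy layer sizes; real models are far larger

# Stand-in for flash: a weight matrix saved to disk and memory-mapped,
# so rows are only read into DRAM when actually touched.
weights = np.random.randn(FFN, HIDDEN).astype(np.float32)
np.save("ffn_weights.npy", weights)
flash_weights = np.load("ffn_weights.npy", mmap_mode="r")

def sparse_ffn(x: np.ndarray, predicted_active: np.ndarray) -> np.ndarray:
    """Compute only the FFN rows a predictor marked as active.

    `predicted_active` stands in for a low-cost predictor that guesses
    which ReLU outputs will be nonzero, so the weight rows for the
    remaining neurons never leave flash.
    """
    active_rows = np.asarray(flash_weights[predicted_active])  # DRAM load
    out = np.zeros(FFN, dtype=np.float32)
    out[predicted_active] = np.maximum(active_rows @ x, 0.0)  # ReLU
    return out

x = np.random.randn(HIDDEN).astype(np.float32)
active = np.random.choice(FFN, size=FFN // 10, replace=False)  # ~90% sparse
y = sparse_ffn(x, active)
print(f"loaded {len(active)}/{FFN} weight rows from 'flash'")
```

In this toy setup, roughly a tenth of the layer’s weights are read from storage per step; the payoff of such schemes is that a model larger than available RAM can still run, at the cost of extra storage reads.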
The paper was published on December 12 but gained wider attention after Hugging Face, a popular site for AI researchers to showcase their work, highlighted it late on Wednesday. It is Apple’s second paper on generative AI this month and follows earlier moves to enable image-generating models such as Stable Diffusion to run on its custom chips.
Device manufacturers and chipmakers are hoping new AI features will help revive the smartphone market, which had its worst year in a decade, with shipments down an estimated 5 percent, according to Counterpoint Research.
Despite launching Siri, one of the first virtual assistants, in 2011, Apple has largely been left out of the wave of generative AI excitement that has swept Silicon Valley since OpenAI launched its breakthrough chatbot ChatGPT. Apple has been seen by many in the AI community as lagging behind its Big Tech rivals, despite hiring Google’s top AI executive, John Giannandrea, in 2018.
While Microsoft and Google have focused more on delivering chatbots and other generative AI services over the internet from their vast cloud computing platforms, Apple’s research suggests it will instead focus on AI that can run directly on an iPhone.
Apple’s rivals, such as Samsung, are preparing to launch a new type of “AI smartphone” next year. Counterpoint estimates that more than 100 million AI-focused smartphones will ship in 2024, with 40 percent of new devices offering such capabilities by 2027.
Cristiano Amon, chief executive of Qualcomm, the world’s largest mobile chipmaker, has predicted that bringing AI to smartphones will lead to a whole new experience for consumers and reverse declining mobile sales.
“You’ll see devices launching as early as 2024 with a number of generative AI use cases,” he told the Financial Times in a recent interview. “As those things grow, they begin to make a meaningful difference to the user experience and enable new innovations that have the potential to create a new upgrade cycle in smartphones.”
More sophisticated virtual assistants will be able to anticipate user actions such as texting or scheduling a meeting, he said, while the devices will also be capable of new kinds of photo-editing techniques.
Google this month unveiled a version of its new Gemini LLM that will run “natively” on its Pixel smartphones.
Running the kind of large AI model that powers ChatGPT or Google’s Bard on a personal device poses formidable technical challenges, as smartphones lack the massive computing resources and energy available in a data center. Solving this problem could mean AI assistants respond more quickly than they do from the cloud and even work offline.
Ensuring that questions can be answered on an individual’s own device without sending data to the cloud is likely to have privacy benefits, a key differentiator for Apple in recent years.
“Our experiment is designed to improve inference performance on personal devices,” its researchers said. Apple tested its approach on models including Falcon 7B, a smaller version of an open-source LLM originally developed by the Technology Innovation Institute in Abu Dhabi.
Optimizing LLMs to run on battery-powered devices has been a growing focus for AI researchers. The academic papers aren’t a direct indication of how Apple plans to add new features to its products, but they do offer a rare glimpse into its secretive research labs and the company’s latest technological breakthroughs.
“Our work not only provides a solution to a current computational bottleneck, but also sets a precedent for future research,” the Apple researchers wrote at the end of their paper. “We believe that as LLMs continue to grow in size and complexity, such approaches will be necessary to realize their full potential across a wide range of devices and applications.”
Apple did not immediately respond to a request for comment.