To date, the most widely used assistants allow you to interact with the phone through voice commands. The utility of being able to use your mobile device hands-free is obvious.
For Google, its biggest investment in this area in 2019 was the new Google Assistant that debuted on the Pixel 4. Google's goal was to “let you instantly operate your phone with your voice, multitask across apps, and complete complex operations, all with nearly zero latency.”
With on-device voice processing to drive this assistant, the company boldly proclaims that “tapping to operate your phone almost feels lazy.”
Unfortunately, the experience — which still exists on Pixel phones today — requires users to stick to specific phrases rather than speaking naturally and having their intent understood automatically. Meanwhile, the possible actions are very limited and do not work with many apps.
Apple is taking another stab at Siri in iOS 18 with Apple Intelligence, with large language models (LLMs) potentially the key to a voice assistant that can use any app on your phone.
Google is researching the same thing, and may very well still build one. However, after I/O 2024, I no longer think this is a priority for the company.
Rather, Google ultimately wants to create an AI assistant that helps you in the physical realm. The idea is that most of your questions and problems still occur in the real world, without any digital equivalent.
The main use case is pointing your phone (or, in the future, smart glasses) at something and asking for more information or help.
That's what Google showed with Project Astra, the interactive Gemini Live experience that lets you have a natural two-way conversation. Gemini Live is expected to arrive this year, first with the audio aspect and then with the camera capabilities.
Meanwhile, much of your information is stored in the form of photos and videos. Gemini-powered Ask Photos turns your library into a repository of real-world knowledge that Google can use to help you.
Taking pictures of information in the real world and having Google organize it is a real time saver and inherently helpful. One of my favorite examples of this from I/O is something that isn't particularly flashy: the upcoming Gemini Extensions for Google Calendar, Keep, and Tasks:
…you'll be able to do things like take a photo of your child's school curriculum and have Gemini create a calendar entry for each assignment, or take a photo of a new recipe and add it to Keep as a shopping list.
Gemini Advanced is also getting an immersive trip planner, while one example of an agentic experience that Google teased starts with taking a photo of the shoes you bought to initiate the return process. In another, Gemini was tasked with helping you move to a new city by making all the necessary changes.
Something that can help you navigate the world certainly seems more novel and impressive than an assistant that can traverse your phone, which Google may very well still build. A better phone assistant would improve Android users' lives today, and it remains to be seen how useful something like Astra actually is, but you can't accuse Google of not swinging for the fences.
Project Astra's stated goal is to build a “universal AI agent that can be truly helpful in everyday life.” Camera input that provides a live view of the world genuinely solves a problem that dates back to Google's origins: some questions and their real-world context cannot easily be put into words for a text query.