Google's Project Astra is an AI application that uses your phone's camera to locate noise makers, misplaced items, and more. It was showcased at Google's I/O 2024 conference. The project stems from Google's ambition to develop universal AI agents that assist in everyday life.
Project Astra appears to be an app with a viewfinder as its main interface. In a demonstration video, the Gemini AI could identify a speaker and explain its parts, create an alliteration about crayons, identify and explain parts of code, and even remember where it had seen a pair of glasses that were no longer in the frame.
The app also demonstrated the ability to suggest ways to improve a system's speed and to generate creative ideas, such as a band name for a duo consisting of a plush tiger toy and a golden retriever.
Project Astra processes visual data in real time and remembers what it has seen. It achieves this by continuously encoding video frames, combining the video and speech input into a timeline of events, and caching this information for efficient recall.
Google has also been enhancing the vocal expression range of its AI, giving the agents a more comprehensive range of intonations. This is similar to the human-like responses provided by Google's Duplex voice assistant technology.
While Project Astra is still in its early stages, Google DeepMind CEO Demis Hassabis suggests that these assistants could eventually be available through your phone or glasses.
Some of these capabilities are expected to come to Google products, like the Gemini app, later this year.