Google DeepMind has introduced updates to the Gemini family of models, including 1.5 Flash, a lightweight model built for speed and efficiency, and shared Project Astra, its vision for the future of AI assistants. Gemini 1.5 Flash is optimized for high-volume tasks at scale and is available in public preview in Google AI Studio and Vertex AI. It excels at summarization, chat applications, image and video captioning, and data extraction.
Alongside the Gemini updates, Google announced Gemma 2, the next generation of its open models.
The 1.5 Pro model has also been improved, with enhancements to code generation, logical reasoning, planning, multi-turn conversation, and audio and image understanding. It can follow increasingly complex instructions, and users can steer its behavior by setting system instructions.
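As an illustration, a system instruction can be supplied alongside the user prompt in the request body when calling the Gemini API's generateContent endpoint. The sketch below is based on the public REST API, not on code from the announcement; the instruction and prompt text are placeholders.

```json
{
  "system_instruction": {
    "parts": [
      { "text": "You are a concise assistant. Answer in at most two sentences." }
    ]
  },
  "contents": [
    {
      "role": "user",
      "parts": [
        { "text": "Summarize the key updates to Gemini 1.5 Pro." }
      ]
    }
  ]
}
```

The system instruction sits outside the conversation turns, so it applies to every response the model generates in that session rather than being treated as part of the user's message.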
The Gemini Nano model now accepts images as input. Applications using Gemini Nano with Multimodality will be able to understand the world not just through text, but also through sight, sound, and spoken language.
The Gemma family of open models is also expanding with PaliGemma, a vision-language model. Gemma 2 features a new architecture designed for performance and efficiency, and will be available in new sizes.
As part of its mission, Google DeepMind has also shared progress toward building future AI assistants with Project Astra. Its prototype agents can process information faster by continuously encoding video frames, combining video and speech input into a timeline of events, and caching this information for efficient recall.
Google DeepMind continues to explore new ideas and unlock new Gemini use cases.