Google has launched Gemini 3 Flash, a new model in the Gemini 3 family designed for speed and efficiency while maintaining frontier intelligence. It makes Gemini 3's advanced capabilities accessible across Google products at a lower cost. Gemini 3 Flash builds on the success of Gemini 3 Pro and Deep Think mode: since the launch of Gemini 3, Google has been processing over 1 trillion tokens daily, and the family excels in complex reasoning, multimodal understanding, and agentic coding tasks.
Performance and Efficiency
Gemini 3 Flash posts scores on PhD-level reasoning and knowledge benchmarks that are comparable to those of larger models: 90.4% on GPQA Diamond and 33.7% on Humanity’s Last Exam without tools. It reaches 81.2% on MMMU Pro, similar to Gemini 3 Pro. The model is also efficient: it uses 30% fewer tokens than Gemini 2.5 Pro on typical tasks and is 3 times faster, at a lower price ($0.50 per 1M input tokens, $3 per 1M output tokens).
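To make the pricing concrete, here is a minimal sketch that estimates the cost of a single request from the listed rates. The function name `estimate_cost` and the example token counts are illustrative assumptions, not part of any Google SDK, and the sketch ignores any pricing tiers or discounts that may apply in practice.

```python
# Rates quoted above for Gemini 3 Flash, in USD per 1 million tokens.
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 3.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload (assumed numbers): 200k input tokens, 50k output tokens.
    cost = estimate_cost(200_000, 50_000)
    print(f"Estimated cost: ${cost:.2f}")  # prints "Estimated cost: $0.25"
```

Because output tokens cost six times as much as input tokens at these rates, the output side tends to dominate the bill for generation-heavy workloads.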
Developer Benefits
Designed for iterative development, Gemini 3 Flash offers low latency and strong coding performance: it scores 78% on SWE-bench Verified, outperforming earlier models and even Gemini 3 Pro. It supports complex video analysis, data extraction, visual Q&A, and agentic workflows, enabling intelligent applications such as in-game assistants and A/B test experiments.
Enterprise and Consumer Access
Gemini 3 Flash is available to developers via the Gemini API, Google AI Studio, Gemini CLI, and Google Antigravity, and to enterprises through Vertex AI and Gemini Enterprise. It replaces Gemini 2.5 Flash as the default model in the Gemini app, giving users enhanced multimodal reasoning for quickly analyzing videos, images, and audio. It also powers AI Mode in Search, providing nuanced, comprehensive answers with real-time information.
Use Cases
- Analyzing short videos and creating improvement plans (e.g., golf swing)
- Real-time sketch recognition
- Audio recording analysis with custom quizzes and explanations
- Voice-driven app creation from unstructured ideas
Gemini 3 Flash combines speed, cost-efficiency, and advanced reasoning to support a wide range of applications for developers, enterprises, and everyday users globally.


