Pro-level reasoning at 3x the speed of Gemini 2.5 Pro. Process PDFs, images, video, and documents with lightning-fast AI optimized for real-time experiences.
A lightweight, high-speed multimodal model optimized for rapid responses, streaming input, and frequent interactions.
Sub-second to near-real-time feedback for instant user experiences.
Designed for high-volume production use without breaking your budget.
Built for many calls per user session in interactive applications.
Especially powerful for video, images, and PDF understanding.
Eliminates the trade-off between smart-but-slow and fast-but-dumb AI models.
Great reasoning, but 10+ seconds per response
Quick replies, but can't handle PDFs, images, or complex inputs
PhD-level reasoning (90.4% GPQA Diamond) at sub-second latency
Gemini 3 Flash excels in scenarios where you need to quickly understand complex inputs.
PDF → Structured JSON extraction with a 15% accuracy boost over previous models.
Extract meaning and normalize messy data from visual inputs.
Process up to 1 hour of video in a single prompt.
78% success rate on SWE-bench Verified for GitHub issues.
Low latency keeps the experience feeling "alive".
High-volume tasks where cost efficiency matters.
Gemini 3 Flash is uniquely positioned for users who need Pro-level intelligence with budget or latency constraints.
Rapid prototyping with "Vibe Coding", production-ready without the latency wall.
Visual data extraction, high-volume copy testing, and analysis of thousands of product photos.
Higher success rates in tool calling and self-correction in coding environments.
15% accuracy boost for messy handwriting, legal contracts, financial tables.
Interactive tutoring and personal coaching with fast responses.
Use Flash as a front-end perception layer and pair it with heavier models for reasoning.
Gemini 3 Flash outperforms Gemini 2.5 Pro in 18 of 20 major benchmark categories while operating 3x faster.
Technical capabilities that make Gemini 3 Flash powerful for real-world use cases.
Control speed vs. depth with minimal, low, medium, and high thinking levels.
First Flash model to match Pro models on elite benchmarks.
Count objects, zoom into regions, extract data from complex layouts.
Multimodal function calling with images or PDFs in tool responses.
Process entire codebases, research libraries, or long videos.
1 hour video, 8.4 hours audio, 900 images per prompt.
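A pre-flight check can catch oversized prompts before they are sent. The sketch below is one possible way to encode the per-prompt limits quoted on this page (including the 45-minute cap for video with audio mentioned in the FAQ). The `PromptMedia` class and `check_media_limits` function are hypothetical names invented for illustration; they are not part of any Gemini SDK, and the numbers are copied from this page rather than verified against the API.

```python
from dataclasses import dataclass

# Per-prompt limits as quoted on this page (assumed, not verified against the API).
MAX_VIDEO_MINUTES = 60.0             # "1 hour video"
MAX_VIDEO_MINUTES_WITH_AUDIO = 45.0  # "45 min with audio"
MAX_AUDIO_HOURS = 8.4
MAX_IMAGES = 900


@dataclass
class PromptMedia:
    """Hypothetical summary of the media attached to a single prompt."""
    video_minutes: float = 0.0
    video_has_audio: bool = False
    audio_hours: float = 0.0
    image_count: int = 0


def check_media_limits(media: PromptMedia) -> list:
    """Return a list of limit violations; an empty list means the prompt fits."""
    problems = []
    video_cap = (
        MAX_VIDEO_MINUTES_WITH_AUDIO if media.video_has_audio else MAX_VIDEO_MINUTES
    )
    if media.video_minutes > video_cap:
        problems.append(f"video: {media.video_minutes} min exceeds {video_cap} min")
    if media.audio_hours > MAX_AUDIO_HOURS:
        problems.append(f"audio: {media.audio_hours} h exceeds {MAX_AUDIO_HOURS} h")
    if media.image_count > MAX_IMAGES:
        problems.append(f"images: {media.image_count} exceeds {MAX_IMAGES}")
    return problems
```

Calling `check_media_limits(PromptMedia(video_minutes=50, video_has_audio=True))` would report one violation, since 50 minutes is over the 45-minute cap for video with audio but under the 60-minute cap for silent video.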
Both Gemini 3 Flash and Gemini 3 Pro deliver pro-level intelligence with different optimizations.
| Feature | Gemini 3 Flash | Gemini 3 Pro |
|---|---|---|
| Best For | Speed & High-volume Agents | Deep Research & Strategy |
| GPQA Diamond | 90.4% | 91.9% |
| Speed | 3x faster ✅ | Slower, deeper thinking |
| SWE-bench | 78.0% ✅ | 76.2% |
| MMMU-Pro | 81.2% ✅ | Slightly lower |
| Primary Use | Interactive assistants, coding agents | Complex architecture, deep research |
Detailed Gemini 3 Flash specs for developers and enterprise users.
Major companies and developers are already using Gemini 3 Flash in production.
"Sets a new standard for accuracy with 15% precision boost in extracting data from handwriting and dense financial tables."
— Box
"Feels instant — Universal Praise on speed. The 'king of Vibe Coding' in developer communities."
— Community Sentiment
"Disruptive — turning search results into functional apps with AI Mode."
— Media Coverage
Gemini 3 Flash is Pro-level in reasoning capability. It scores 90.4% on GPQA Diamond (PhD-level reasoning), very close to Gemini 3 Pro's 91.9%. However, it's optimized for speed, not maximum depth.
Gemini 3 Flash is 3x faster than Gemini 2.5 Pro while outperforming it on 18 of 20 major benchmarks. It's designed for sub-second response times.
Thinking level is a new API parameter that controls reasoning depth. Flash supports four levels: Minimal (instant), Low (quick queries), Medium (general use), and High (complex logic). This gives you explicit control over the speed vs. depth trade-off.
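One way to use the four levels is to pick a level per task type instead of hard-coding a single value. The following is a minimal sketch under stated assumptions: the task categories, the `pick_thinking_level` and `build_request_options` helpers, and the `"thinking_level"` option key are all illustrative placeholders, so check the official API reference for the real parameter name and where it belongs in a request.

```python
from enum import Enum


class ThinkingLevel(str, Enum):
    """The four levels described on this page."""
    MINIMAL = "minimal"  # instant
    LOW = "low"          # quick queries
    MEDIUM = "medium"    # general use
    HIGH = "high"        # complex logic


# Hypothetical task categories; adapt these to your own application.
DEFAULT_LEVELS = {
    "autocomplete": ThinkingLevel.MINIMAL,
    "quick_query": ThinkingLevel.LOW,
    "general": ThinkingLevel.MEDIUM,
    "complex_logic": ThinkingLevel.HIGH,
}


def pick_thinking_level(task_kind: str) -> ThinkingLevel:
    """Map a task category to a level, falling back to MEDIUM for unknown tasks."""
    return DEFAULT_LEVELS.get(task_kind, ThinkingLevel.MEDIUM)


def build_request_options(task_kind: str) -> dict:
    """Build an options dict; the "thinking_level" key is an assumed placeholder."""
    return {"thinking_level": pick_thinking_level(task_kind).value}
```

Keeping the mapping in one table makes the speed vs. depth trade-off easy to tune per feature without touching the calling code.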
Gemini 3 Flash handles multimodal input natively. It can process PDFs (rendered as images), up to 1 hour of video (45 min with audio), up to 8.4 hours of audio, and up to 900 images per prompt.
If your task requires deep multi-step reasoning, complex math proofs, or long research synthesis, consider Gemini 3 Pro. Or use the hybrid pattern: Flash for input processing, Pro for final reasoning.
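The hybrid pattern can be sketched as a two-stage pipeline. In this illustrative example the two models are passed in as plain callables (`call_flash` and `call_pro`), which are hypothetical stand-ins for whatever client code you use; the prompts and the `hybrid_answer` function are assumptions for demonstration, not a documented API.

```python
from typing import Callable

# A model call is assumed to take a prompt string and return a text response.
ModelCall = Callable[[str], str]


def hybrid_answer(
    raw_input: str,
    question: str,
    call_flash: ModelCall,
    call_pro: ModelCall,
) -> str:
    """Use the fast model for input processing, then the deeper model for reasoning."""
    # Stage 1: the fast model condenses the raw input into key facts.
    facts = call_flash(f"Extract the key facts from this input:\n{raw_input}")
    # Stage 2: the deeper model reasons over the condensed facts only.
    return call_pro(f"Facts:\n{facts}\n\nQuestion: {question}")


if __name__ == "__main__":
    # Stub callables so the sketch runs without any network access.
    def fake_flash(prompt: str) -> str:
        return "- invoice total: 120 EUR"

    def fake_pro(prompt: str) -> str:
        return f"Answer based on {prompt.count('invoice')} invoice fact(s)."

    print(hybrid_answer("scanned invoice text", "What is owed?", fake_flash, fake_pro))
```

Because the second stage only sees the condensed facts, the slower model handles a much smaller prompt, which is the point of using Flash as the perception layer.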
The limits are 1 million input tokens and 65,536 output tokens. That is enough for entire codebases, long videos, or hundreds of documents.
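To get a rough sense of whether a batch of text documents fits within these limits, a back-of-the-envelope estimate can help. The sketch below assumes roughly 4 characters per token, which is a common rule of thumb for English text and not an exact figure for this model; the function names are hypothetical, and a real token-counting endpoint should be preferred when precision matters.

```python
MAX_INPUT_TOKENS = 1_000_000  # from this page
MAX_OUTPUT_TOKENS = 65_536    # from this page
CHARS_PER_TOKEN = 4           # ASSUMPTION: rough heuristic, not a model-specific value


def estimate_tokens(text: str) -> int:
    """Estimate the token count by rounding up len(text) / CHARS_PER_TOKEN."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_limits(documents: list, requested_output_tokens: int = 8_192) -> bool:
    """Check estimated input size and requested output size against the limits."""
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    total_input = sum(estimate_tokens(doc) for doc in documents)
    return total_input <= MAX_INPUT_TOKENS
```

Under this heuristic, 1 million input tokens corresponds to about 4 million characters of plain text, so leave some headroom for the prompt itself and for estimation error.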
Available now across multiple platforms. Experience pro-level AI at lightning speed.