Gemini 3.2 Flash: Faster, Cheaper, Smarter AI Model Leaked
Summary
Google's next AI model, Gemini 3.2 Flash, is rumored to be faster, cheaper, and smarter. Leaks suggest the model targets near-flagship performance at much lower latency and cost, and that it may be renamed Gemini 3.5 Flash before launch. It is reportedly optimized for speed, with responses to many prompts potentially returning in under 200 milliseconds. Latency that low would benefit AI assistants, real-time voice conversations, and mobile AI applications. Despite its lightweight design, Gemini 3.2 Flash is expected to perform close to Gemini 3.1 Pro on tasks such as reasoning and coding. Google is reportedly achieving these gains through advanced distillation and sparse architecture optimizations. The rumored pricing is aggressive: about $0.25 per 1 million input tokens and $2 per 1 million output tokens. If accurate, that could significantly reduce operating costs for both Google and developers.
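To put the rumored per-token prices in concrete terms, here is a minimal Python sketch of the cost arithmetic. The two rates come from the leak and are unconfirmed. The function name `estimate_cost` and the example workload of 2,000 input tokens and 500 output tokens per request are assumptions made only for illustration.

```python
# Rumored, unconfirmed prices from the leak (USD per 1 million tokens).
INPUT_PRICE_PER_MILLION = 0.25
OUTPUT_PRICE_PER_MILLION = 2.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the rumored rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 2,000 input tokens and 500 output tokens per request.
    per_request = estimate_cost(2_000, 500)
    per_million_requests = per_request * 1_000_000
    print(f"per request: ${per_request:.6f}")
    print(f"per million requests: ${per_million_requests:,.2f}")
```

At these rates the hypothetical request costs about $0.0015, so a million such requests would come to roughly $1,500. Output tokens account for two thirds of that total even though they are a quarter as numerous as the input tokens, because the rumored output rate is eight times the input rate.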
This is an AI-generated audio summary. Always check the original source for complete reporting.