Gemini 2.5 Family Models Now Available for Production Use

Gemini 2.5 is a family of hybrid reasoning models that has reached the Pareto frontier of cost and speed. The company is releasing stable versions of its 2.5 Pro and Flash models, which developers have already been using in production for several weeks. Additionally, Gemini 2.5 Flash-Lite, the most cost-efficient and fastest 2.5 model yet, is now available in preview.

The new models offer improved performance across coding, math, science, reasoning, and multimodal benchmarks. Gemini 2.5 Flash-Lite in particular excels at high-volume, latency-sensitive tasks such as translation and classification, delivering lower latency than the earlier 2.0 Flash-Lite and 2.0 Flash models.

Gemini 2.5 Flash-Lite offers the same core capabilities as the other 2.5 models, including the ability to turn thinking on at different budgets, connections to tools like Google Search, multimodal input, and a 1 million-token context length. The preview model is available in Google AI Studio and Vertex AI, while the stable 2.5 Pro and Flash models are also accessible in the Gemini app.
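
For developers who want to try the controllable thinking budget, the sketch below shows one way it might be invoked through the Google Gen AI Python SDK. The model ID and budget value are illustrative assumptions, not details confirmed in the announcement.

```python
# Minimal sketch using the google-genai Python SDK (pip install google-genai).
# The model ID and thinking budget below are illustrative assumptions.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # or configure Vertex AI credentials

response = client.models.generate_content(
    model="gemini-2.5-flash-lite-preview",  # hypothetical preview model ID
    contents="Classify the sentiment of: 'The update shipped ahead of schedule.'",
    config=types.GenerateContentConfig(
        # thinking_budget caps how many tokens the model may spend reasoning;
        # 0 turns thinking off for latency-sensitive tasks like classification.
        thinking_config=types.ThinkingConfig(thinking_budget=0),
    ),
)
print(response.text)
```

Setting the budget to 0 disables thinking entirely, which suits latency-sensitive workloads like the classification example above; raising it lets the model spend more tokens reasoning before it answers.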

Developers can start building with the preview version of Gemini 2.5 Flash-Lite and share feedback. The company is excited to see what users continue to build with the new models, which are designed to let teams ship production applications with confidence.

Source: https://blog.google/products/gemini/gemini-2-5-model-family-expands