Google vs Google

Gemini 3.1 Pro vs Gemini 2.5 Flash Image

Compare Gemini 3.1 Pro and Gemini 2.5 Flash Image across pricing, context window, capabilities, benchmarks, and API access to choose the better fit for long-context reasoning workloads versus image generation and editing workloads.

Gemini 3.1 Pro

Feb 19, 2026 1,048,576 context 65,536 tokens output

Gemini 2.5 Flash Image

Oct 07, 2025 1,048,576 context 32,768 tokens output


Overview Comparison

Structured side-by-side differences for the highest-signal model metadata.

Gemini 3.1 Pro

Gemini 2.5 Flash Image

Provider

The entity that currently provides this model.

Gemini 3.1 Pro Google

Gemini 2.5 Flash Image Google

Model ID

The routed model identifier exposed by upstream providers.

Gemini 3.1 Pro google/gemini-3.1-pro-preview

Gemini 2.5 Flash Image google/gemini-2.5-flash-image

Input Context Window

The number of tokens supported by the input context window.

Gemini 3.1 Pro 1,048,576 tokens

Gemini 2.5 Flash Image 1,048,576 tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

Gemini 3.1 Pro 65,536 tokens

Gemini 2.5 Flash Image 32,768 tokens
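
The limits above can be turned into a quick pre-flight check. The following is a minimal sketch, not an official client: the `LIMITS` table and the `fits` helper are hypothetical names introduced here, and it assumes the input window and the output cap are checked independently. Whether output tokens also count against the context window depends on the provider, so treat this as an approximation.

```python
# Token limits copied from the rows above; the dictionary layout is an assumption.
LIMITS = {
    "google/gemini-3.1-pro-preview": {"context": 1_048_576, "max_output": 65_536},
    "google/gemini-2.5-flash-image": {"context": 1_048_576, "max_output": 32_768},
}


def fits(model_id: str, prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if the request stays within both listed limits.

    Assumption: input and output limits are checked independently.
    """
    limits = LIMITS[model_id]
    return (
        prompt_tokens <= limits["context"]
        and requested_output_tokens <= limits["max_output"]
    )


if __name__ == "__main__":
    for model_id in LIMITS:
        # A 900k-token prompt asking for a 50k-token answer.
        print(model_id, fits(model_id, 900_000, 50_000))
```

With these numbers, a request for 50,000 output tokens fits Gemini 3.1 Pro but exceeds the 32,768-token output cap listed for Gemini 2.5 Flash Image.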

Open Source

Whether the model's code is available for public use.

Gemini 3.1 Pro No

Gemini 2.5 Flash Image No

Release Date

When the model was first released.

Gemini 3.1 Pro Feb 19, 2026

Gemini 2.5 Flash Image Oct 07, 2025

Knowledge Cut-off Date

When the model's knowledge was last updated.

Gemini 3.1 Pro February 2026

Gemini 2.5 Flash Image January 31, 2025

API Providers

The providers that currently expose the model through an API.

Gemini 3.1 Pro

Google, OpenRouter, Vertex AI, Gemini API

Gemini 2.5 Flash Image

Google, Vertex AI, Gemini API

Modalities

Types of data each model can process or return.

Gemini 3.1 Pro

Text Image File Audio Video Code

Gemini 2.5 Flash Image

Text Image

Pricing Comparison

Compare current token pricing before you choose the cheaper or more scalable API option.

Gemini 3.1 Pro Google

Input price $2.00 Per 1M tokens

Output price $12.00 Per 1M tokens

Gemini 2.5 Flash Image Google

Input price $0.30 Per 1M tokens

Output price $2.50 Per 1M tokens
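
To see what the per-token prices mean for a concrete request, the arithmetic can be sketched as below. This is an illustrative estimate only: the `PRICES` table and `estimate_cost` helper are hypothetical names, the prices are the ones listed above, and the sketch assumes flat per-token billing with no tiering, caching discounts, or per-image charges.

```python
# USD per 1M tokens, copied from the pricing rows above.
PRICES = {
    "google/gemini-3.1-pro-preview": {"input": 2.00, "output": 12.00},
    "google/gemini-2.5-flash-image": {"input": 0.30, "output": 2.50},
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under flat per-token pricing."""
    price = PRICES[model_id]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 100k input tokens and 5k output tokens.
    for model_id in PRICES:
        print(f"{model_id}: ${estimate_cost(model_id, 100_000, 5_000):.4f}")
```

For that example workload the estimate is about $0.26 for Gemini 3.1 Pro and about $0.04 for Gemini 2.5 Flash Image, roughly a sixfold difference.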

Capabilities Comparison

See where each model overlaps, where they differ, and which one supports more of the features you care about.

Capability

Gemini 3.1 Pro

Gemini 2.5 Flash Image

Agentic Task Execution Supports autonomous, long-horizon task execution with improved tool orchestration and stability, suited for structured domains like finance and spreadsheet workflows.

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image —

Character Consistency Maintains consistent visual representations of characters across multiple generated images, supporting sequential storytelling and narrative workflows.

Gemini 3.1 Pro —

Gemini 2.5 Flash Image Supported

Code Generation Produces and analyzes code across multiple programming languages, with measurable gains on SWE benchmarks and real-world software engineering environments.

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image —

Configurable Thinking Offers a medium thinking level setting that allows users to tune the trade-off between reasoning depth, response speed, and token cost per request.

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image —

File

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image —

Image

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image Supported

Image Generation Generates images from natural language text prompts, drawing on Gemini's world knowledge to produce contextually accurate visual outputs.

Gemini 3.1 Pro —

Gemini 2.5 Flash Image Supported

Large Context Window Supports a context window of 1,048,576 tokens, allowing detailed prompts, instructions, and multiple image references to be included in a single request.

Gemini 3.1 Pro —

Gemini 2.5 Flash Image Supported

Long Context Window Processes up to 1,048,576 tokens in a single request, enabling analysis of entire codebases, lengthy documents, or extended multi-turn conversations without truncation.

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image —

Multi-Image Blending Accepts arrays of image URLs as input and combines multiple source images into a single cohesive output in one request.

Gemini 3.1 Pro —

Gemini 2.5 Flash Image Supported

Multi-Step Reasoning Applies structured reasoning chains to complex problems, achieving a 77.1% score on the ARC-AGI-2 benchmark across logic, planning, and inference tasks.

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image —

Multimodal Input Accepts and reasons over text, images, video, audio, and code within a single unified model, without requiring separate specialized models per modality.

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image —

Natural Language Editing Applies targeted transformations to existing images using plain text instructions, enabling precise edits without manual masking or selection tools.

Gemini 3.1 Pro —

Gemini 2.5 Flash Image Supported

Reasoning

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image —

Structured Output

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image Supported

Text

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image Supported

Tool Use Accepts tool definitions as inputs and can invoke external functions or APIs during a response, enabling integration with custom workflows and data sources.

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image —

Tools

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image —

Video

Gemini 3.1 Pro Supported

Gemini 2.5 Flash Image —

World Knowledge Integration Leverages Gemini's language understanding to ground image generation in factual and contextual knowledge, improving accuracy for real-world subjects and scenes.

Gemini 3.1 Pro —

Gemini 2.5 Flash Image Supported

Benchmark Comparison

Shared benchmark rows make it easier to compare performance where both models have published scores.

Benchmark	Gemini 3.1 Pro	Gemini 2.5 Flash Image
ARC-AGI-2 Novel abstract reasoning and pattern recognition	77.1%	N/A
BrowseComp Complex web browsing and information retrieval	85.9%	N/A
GPQA Diamond PhD-level science questions (biology, physics, chemistry)	94.1%	N/A
HLE Questions that challenge frontier models across many domains	44.7%	N/A
MCP-Atlas Tool Use Structured tool use via Model Context Protocol	69.2%	N/A
MMMLU Multilingual and multimodal understanding	92.6%	N/A
SciCode Scientific research coding and numerical methods	58.9%	N/A
SWE-bench Pro Challenging real-world software engineering tasks	54.2%	N/A
SWE-bench Verified Real GitHub issues requiring multi-file code fixes	80.6%	N/A
Terminal-Bench 2.0 Agentic coding and terminal command tasks	68.5%	N/A
τ²-bench Retail Agentic tool use in retail scenarios	90.8%	N/A
τ²-bench Telecom Agentic tool use in telecom scenarios	99.3%	N/A

Community discussion

What Reddit discussions say about Gemini 3.1 Pro vs Gemini 2.5 Flash Image

Gemini 3.1 Pro and Gemini 2.5 Flash Image are both surfacing live Reddit discussions, giving this comparison a community layer beyond specs and benchmarks.

The most visible threads right now are clustered in r/GeminiAI, r/Bard, and r/singularity.

Gemini 2.5 Flash Image r/singularity 1,601 upvotes 352 comments August 28, 2025

With respect to the production of pornography, we have split the atom

Playing around with Gemini 2.5 Flash Image (sorry, not calling it that other name) just now, I felt like Oppenheimer staring at the fireball. Such an enormity of new power, so suddenly.

The masturbators of tomorrow will marvel that people were once limited to non-customized pornography.

Seriously, I think this changes everything.


Gemini 2.5 Flash Image r/wallstreetbets 710 upvotes 208 comments October 24, 2025

Daily GOOGLE GOON SQUAD: $GOOGL$ is the true AI king and it’s about to print

Fellow Regards and Degenerates,

I'm here to tell you that $GOOGL / $GOOG is the most criminally undervalued stock in mega-cap tech because it’s the undisputed leader in the technologies that define the next century. Forget the short-term noise. This is a deep dive into the strategic moat that others can't even dream of crossing.

**1. Future of Tech**

**Waymo**

Google's Waymo is WAY MORE than a competitor. It's the only fully scaled, commercialized Level 4 self-driving service available to the public. It operates 24/7 robotaxi services in multiple major US cities like Phoenix, San Francisco, Los Angeles, Austin and testing in other cities

In San Francisco, its massive surge in volume has already resulted in its market share surpassing Lyft's, making it the city's second-most popular ride-hailing service. It’s the result of a decade-plus of calm, deep-pocketed investment, allowing it to log over 100 million fully autonomous miles and complete over 10 million paid trips.

The sheer mileage, the complexity of the scaled deployments—which have demonstrated an 80% reduction in injury-causing crashes compared to human drivers—and the fact that they are now expanding internationally to places like Tokyo and London is a moat that no other company has even come close to building. The heck, there is no second competition in autonomous self-driving.

**Quantum Leap for Humanity**

The recent quantum discovery by Google, featuring its Quantum Echoes algorithm, is a major step toward making quantum computers a practical, powerful tool. This breakthrough, which demonstrated verifiable quantum advantage on the Willow quantum chip, is set to accelerate scientific discovery across key industries.

Specifically, the ability to perform verifiable quantum advantage means we can now trust a quantum computer to reliably solve real-world physics problems that are computationally infeasible for classical machines.

What Quantum Echoes Will Do

This breakthrough directly accelerates the original promise of quantum computing:

* Design Better Drugs and Cures: The Quantum Echoes algorithm ran 13,000 times faster on Willow than the best classical algorithm on one of the world's fastest supercomputers. This technique—which is already being used in a quantum-enhanced version of Nuclear Magnetic Resonance (NMR) to study molecular structure—will dramatically cut the time it takes to discover and develop new, more effective medicines by providing unprecedented insights into how potential drug compounds interact with disease targets.
* Create Advanced New Materials: The algorithm's power to reveal previously undetectable details about atomic interactions will unlock the discovery and design of novel materials. This is vital for creating the next generation of:
* High-Performance Batteries (for electric vehicles and energy storage).
* More Efficient Solar Cells.
* Lighter, Stronger Polymers for manufacturing and aerospace.

In short, Google's Quantum Echoes is an engineering milestone that moves quantum computing from a theoretical concept to a practical, verifiable machine for solving humanity's hardest scientific problems.

Think of it this way - The average age of a few generations from now will be approximately 100 years. This is truly remarkable.

**AI: The Medical Revolution**

AI, particularly from Google DeepMind, is already achieving breakthroughs that save time, money, and lives. This is AI's immediate, profitable impact.

* AlphaFold & Isomorphic Labs: AlphaFold, an AI model from DeepMind, solved the 50-year-old problem of protein folding. This monumental achievement earned Google DeepMind's Demis Hassabis and John Jumper a share of the 2024 Nobel Prize in Chemistry (along with David Baker). In simple terms, proteins are the body's tiny machines. Knowing their 3D shape is the blueprint for creating drugs. AlphaFold can find that blueprint in minutes, a process that used to take years. Isomorphic Labs is now using this and other advanced AI to design new small-molecule drugs from scratch at "digital speed," accelerating drug discovery from years to months.
* AI and Quantum Synergy: This is where the magic happens. AI (the brain) helps guide the ultra-powerful quantum computer (the brawn) by identifying which molecules to focus on and then analyzing the quantum simulation results. This hybrid approach makes breakthroughs possible that would be computationally impossible otherwise. Google is the only company with a dominant lead in *both* technologies.

**2. AI Supremacy: The Foundational Architect**

The current AI boom exists because of Google, and its competitive position is strong due to decades of strategic investment focused on making powerful technology affordable enough to scale effectively. By now, it is widely known that the foundational technology for modern AI—the Transformer architecture—was created by Google.

* Models: Leading Across the Modalities Google has established market-leading or top-tier models across text, image, and video.
* Text & Multimodal: The Gemini family of models sets the pace in multimodal reasoning, handling text, code, audio, and video inputs.
* Image (Nano Banana/Imagen): The technology powering Nano Banana (Gemini 2.5 Flash Image) excels at enterprise-critical tasks like advanced editing that preserves character/product consistency across iterations—a crucial capability for marketing and design.
* Video (Veo): Google's cutting-edge video generation models, like Veo, are rapidly advancing the state-of-the-art in creating high-quality, long-form video content.
* Infrastructure: The TPU Efficiency Moat Google designs its own custom AI chips, the Tensor Processing Units (TPUs), which are engineered for peak AI efficiency and low-cost operation. They have spent years perfecting this hardware because a tech needs to be affordable for it to scale and work. This commitment to efficiency is so superior that competitors, including major AI labs, must increasingly rely on the latest generations of Google's custom hardware by coming to Google Cloud Platform (GCP) to train and run their own cutting-edge models. This external validation proves that Google's approach is about making large-scale AI economically sensible.

The Vertical Advantage:

Google is the only major company that is competing fiercely and winning or coming close to the top in every critical layer of the AI stack:

1. Infrastructure (TPUs): Competing directly with NVIDIA on highly efficient, specialized AI silicon.
2. Foundation Models (Gemini, Imagen, Veo): Competing with OpenAI/Microsoft and Anthropic on core intelligence.
3. Applications (Nano Banana, AI Overviews): Integrating AI features into products that serve billions of users globally.

This end-to-end control, from the silicon chip to the final consumer application, provides a powerful strategic and economic advantage that is unmatched in the industry.

**3. The ChatGPT Myth and Search Dominance**

The idea that ChatGPT will kill Google Search is a false narrative. Facebook, Instagram, TikTok, and Reddit were all supposed to reduce Google search queries. They have only grown. This new technology has made it much easier to ask any type of question in any language. We were previously limited to what we would or could google. Now there are no limits. The more we know, the more questions we have and the more we search. Google Search will be just fine.

I think ChatGPT will become another app on the phone that users go to. I envision it as a personal assistant and less as search. But only time will tell.

Google was and will remain the gateway to the internet. The new AI business will be a net positive for Google by creating a new revenue stream through Google Cloud (GCP) and Gemini features and subscriptions for its user base.

**4. The Financial Powerhouse and PE Hypothesis**

The fundamentals confirm this giant is firing on all cylinders.

* Net Income King: Alphabet's trailing-twelve-month net income for the period ending June 30, 2025, was $115.573 billion, making it one of the most profitable companies in the world. This was more than MSFT ($101.832 billion) and AAPL ($99.280 billion).
* Accelerating Triple-Threat Growth: All core segments (Google Cloud, YouTube, and Google Search) are growing at double-digit rates.

The core reason Google's Price-to-Earnings (PE) ratio is generally lower than many other tech companies is its revenue mix being heavily dominated by consumer advertising.

Simply put, investors are willing to pay a higher multiple (PE) for the more predictable, higher-margin, and rapidly growing recurring revenue streams typical of enterprise software and cloud platforms.

My hypothesis is that with AI increasingly driving revenue through Google Cloud Platform (GCP), the enterprise segment will become a bigger component of Google's business mix, and hence the company will earn a higher blended Price-to-Earnings (PE) ratio. This is because Enterprise and Cloud businesses are valued more highly, providing predictable, high-margin, recurring subscription revenue (SaaS), a financial profile superior to advertising. As this higher-multiple segment captures a greater share of Google's overall profit, the market will be forced to re-rate $GOOGL with a higher blended multiple, making the current valuation, which is depressed by the ad-centric multiple, look like a significant undervaluation and a compelling investment opportunity.

TLDR : GOOGL is a generational buy. You're buying the best-in-class *present* (Search/Maps/YouTube), the scaled *near-future* (Waymo/GCP), and the *long-term future* (Quantum/AI Core Tech) at a discount.



Gemini 3.1 Pro r/LocalLLaMA 685 upvotes 177 comments April 23, 2026

Qwen 3.6 27B Makes Huge Gains in Agency on Artificial Analysis - Ties with Sonnet 4.6

It is crazy that Qwen3.6 27B now matches Sonnet 4.6 on AA's Agentic Index, overtaking Gemini 3.1 Pro Preview, GPT 5.2 and 5.3 as well as MiniMax 2.7. It made gains across all three indices, but the way the Coding Index works, I don't think the gains are as apparent as they should be. The Coding Index only uses Terminal Bench Hard and SciCode, which are both strange choices. Clearly the training on the 3.6 models out now has focused on agentic use for OpenClaw/Hermes, but it's interesting how close to frontier models such a small model can get. Qwen3.6 122B might be epic...


Gemini 3.1 Pro r/DeepSeek 672 upvotes 162 comments March 2, 2026

Deepseek V4 - All Leaks and Infos for the Release Day - Not Verified!

**Deepseek V4** will probably release this week. Since I've already posted quite a lot about it here and I'm very hyped about V4, **I've summarized all the leaks. Everything is just leaked, unconfirmed**! Of course, everything could be different. If you have any new information or updates, please post them here! If you have different views or a different opinion, write them down too.

# DeepSeek V4 - Release

The release was originally expected for mid-February, alongside Gemini 3.1 Pro. However, DeepSeek has been delayed – this is not unusual and has happened multiple times before. The new release strongly points to **March 3rd** (Lantern Festival / 元宵节), but it could also be later in the week. The Financial Times reported on February 28th that V4 is coming "next week," timed to coincide with China's "Two Sessions" (两会) starting March 4th. DeepSeek's release pattern shows that new models often drop on **Tuesdays**. A short technical report is expected to be published simultaneously, with a full engineering report following about a month later.

# DeepSeek Delay History

DeepSeek delays regularly. Here's the pattern:

|Model|Originally Expected|Actual Release|Delay|
|:-|:-|:-|:-|
|DeepSeek-R1|Lite Preview Nov 2024, Full Version Dec 2024|January 20, 2025|\~4-8 weeks|
|DeepSeek-R2|May 2025 (according to reports)|Never released – replaced by R1-0528 update|Cancelled|
|DeepSeek-V3.1|Early Summer 2025 (expected)|August 21, 2025|Several months|
|DeepSeek-V3.2|Fall 2025 (expected)|December 1, 2025 (V3.2-Exp: Sep 29)|Weeks|
|DeepSeek-V4|\~February 17, 2026|\~March 3, 2026?|\~2 weeks|

# Architecture & Specifications – What Can We Expect?

**All unconfirmed! Much of this has been leaked but could turn out differently!**

# V4 Flagship – Main Model

|Specification|DeepSeek V3/V3.2|DeepSeek V4 (Leaks)|
|:-|:-|:-|
|Total Parameters|671B–685B MoE|\~1 Trillion (1T) MoE|
|Active Parameters/Token|\~37B|\~32B (fewer despite a larger model!)|
|Context Window|128K (since Feb '26: 1M)|1 Million Tokens (native)|
|Architecture|MoE + MLA|MoE + MLA + Engram Memory + mHC + DSA Lightning|
|Multimodal|No (text only)|Yes – Text, Image, Video, Audio (native)|
|Expert Routing|Top-2/Top-4 from 256 experts|16 experts active per token (from hundreds)|
|Hardware Optimization|Nvidia H800/H20 (CUDA)|Huawei Ascend + Cambricon (Nvidia secondary!)|
|Training|14.8T Tokens, H800 GPUs|Trained on Nvidia, inference optimized for Huawei|
|License|\-|\-|
|Input Modalities|Text|Text, Image, Video, Audio|
|Output Modalities|Text|Text (Image/Video generation unclear)|
|Estimated Input Price|$0.28/M Tokens|\~$0.14/M Tokens|
|Estimated Output Price|$0.42/M Tokens|\~$0.28/M Tokens|

# New Architecture Features (all backed by papers)

* **Engram Conditional Memory** (Paper: arXiv:2601.07372, Jan 13, 2026): O(1) hash lookup for static knowledge directly in DRAM. Saves GPU computation. 75% dynamic reasoning / 25% static lookups. Needle-in-a-Haystack: 97% vs. 84.2% with standard architectures
* **Manifold-Constrained Hyper-Connections (mHC)**: Solves training stability at 1T+ parameters. Separate paper published in January 2026
* **DSA Lightning Indexer**: Builds on V3.2-Exp's DeepSeek Sparse Attention. Fast preprocessing for 1M-token contexts, \~50% less compute

# DeepSeek V4 Lite (Codename: "sealion-lite")

A lighter variant has leaked alongside the flagship. At least one inference provider is testing the model under strict NDA.

|Specification|V4 Lite (Leak)|
|:-|:-|
|Parameters|\~200 Billion|
|Context Window|1M Tokens (native)|
|Multimodal|Yes (native)|
|Engram Memory|No (according to 36kr, not integrated)|
|vs. V3.2|"Significantly better" than current Web/App|
|Non-Thinking vs. V3.2 Thinking|Non-Thinking mode surpasses V3.2 Thinking mode|
|Status|NDA testing at inference providers|

# SVG Code Leak Examples

* **Xbox Controller**: 54 lines of SVG – highly detailed and efficient
* **Pelican on a Bicycle**: 42 lines of SVG – multi-element scene

According to internal evaluations: V4 Lite outperforms DeepSeek V3.2, Claude Opus 4.6 AND Gemini 3.1 in code optimization and visual accuracy.

# Leaked Benchmarks (NOT verified!)

**⚠️ IMPORTANT: All benchmark numbers come from internal leaks. The "83.7% SWE-bench" graphic circulating on X has been confirmed as FAKE (denied by the Epoch AI/FrontierMath team). The numbers below are the more conservative, more frequently cited leaks.**

|Benchmark|V4 (Leak)|V3.2|V3.2-Exp|Claude Opus 4.6|GPT-5.3 Codex|Qwen 3.5|
|:-|:-|:-|:-|:-|:-|:-|
|HumanEval (Code Gen)|\~90%|–|–|\~88%|**\~93%**|–|
|SWE-bench Verified|**>80%**|\~73.1%|67.8%|80.8%|80.0%|76.4%|
|Needle-in-a-Haystack|97% (Engram)|–|–|–|–|–|
|MMLU-Pro|TBD|85.0|–|85.8|–|–|
|GPQA Diamond|TBD|82.4|–|91.3|–|–|
|AIME 2025|TBD|93.1|–|87.2|–|–|
|Codeforces Rating|TBD|2386|–|2100|–|–|
|BrowseComp|TBD|51.4-67.6|40.1|84.0|–|–|

# Huawei & Hardware – The Geopolitical Dimension

* **Reuters (Feb 25)**: DeepSeek deliberately denied Nvidia and AMD access to the V4 model
* **Huawei Ascend + Cambricon** have early access for inference optimization
* Training was done on Nvidia hardware (H800), but **inference** is optimized for Chinese chips
* For the open-source community on Nvidia GPUs: performance could be **suboptimal** at launch
* This is an unprecedented hardware bet for a frontier model

# Price Comparison (estimated)

|Model|Input/1M Tokens|Output/1M Tokens|
|:-|:-|:-|
|DeepSeek V4 (estimated)|**\~$0.14**|**\~$0.28**|
|DeepSeek V3.2|$0.28|$0.42|
|Kimi K2.5|$0.60|$3.00|
|Gemini 3.1 Pro|$2.00|$12.00|
|Claude Opus 4.6|$5.00|$25.00|

If correct: V4 would be **36x cheaper** than Claude Opus 4.6 on input and **89x cheaper** on output.

# Open Questions

* Does V4 actually generate images/videos or just understand them?
* Will Nvidia GPU users get an optimized version?
* When will the open-source weights be released?

**Sources**: Financial Times, Reuters, CNBC, awesomeagents.ai, nxcode.io, FlashMLA GitHub, r/LocalLLaMA, Geeky Gadgets, 36kr

**Edit 03.03.2026**

The chance that the model will be released this week is relatively high, but probably not today. If it is not published within the next 5 hours, DeepSeek is assumed to release between March 3 and 5. It should come in the next few days, since a release later today would deviate from the usual release pattern (in terms of time of day).

**Edit 03.03.2026 Part 2**

The situation is becoming increasingly heated and tense, with an extremely large number of leaks and sources currently emerging. Collecting them all and verifying their credibility would take a very long time. However, a release is expected this week, with Wednesday or Thursday being the most likely dates.

**Edit 03.03.2026 Part 3 – Evening Update**

March 3rd (Lantern Festival) has passed without a release. However, in Beijing it is currently the early morning of March 4th, meaning the Chinese workday hasn't even started yet. A release on March 4th is still very much possible, especially since China's "Two Sessions" (两会) begin today.

What happened today:

1. **V4 Lite is being silently updated in production.** AIBase reported today that DeepSeek quietly pushed a new V4 Lite version tagged "0302". Community testers report a massive quality jump in logic, code generation, and aesthetics – now reportedly on par with Claude Sonnet 4.6. This strongly suggests DeepSeek is actively fine-tuning V4 models right before the official launch. (Source: AIBase)
2. **36kr published a new article** titled "The Entire Village Anticipates DeepSeek to Join for Dinner" – confirming the entire Chinese tech industry is waiting for V4. (Source: 36kr)

**Edit 04.03.2026 – Why not today, why Thursday is THE day**

March 4 passed without a release – and that makes strategic sense.

**Why not today:**

* CPPCC opening day = all Chinese media focused on politics, V4 would've been buried
* Shanghai Composite dropped 0.98% to 4,082 (4-week low) – bad sentiment to release into
* Beijing evening release window (8-10 PM BJT) has passed

**Why Thursday March 5 is the perfect storm:**

* **NPC opens tomorrow morning** – Premier Li Qiang delivers Government Work Report with AI & tech as centerpiece of the new Five-Year Plan. Morning: politics declares AI a national priority → Evening: DeepSeek delivers the proof
* **BYD "disruptive technology" event same day** – DiPilot 5.0, Blade 2.0, DM 6.0 reveal. Global headline: "China showcases two AI breakthroughs in one day"
* **Market timing** – Shanghai closes 3 PM BJT, evening release gives markets overnight to digest, Friday opens with V4 hype
* **Developer weekend** – Thursday drop = Fri + Sat + Sun to test & benchmark

**Expected release window:**

|Release|Beijing Time|UTC|
|:-|:-|:-|
|R1 (Jan 2025)|\~10-11 PM|\~2-3 PM|
|V3.2 (Nov 2025)|\~12 AM|\~4 PM|
|**V4 (expected)**|**8-11 PM**|**12-3 PM**|

**If Thursday doesn't happen?**

* Friday = bad release day (weekend kills momentum, DeepSeek has never released on a Friday)
* Next window: Monday/Tuesday March 9-10
* But: silent V4 Lite "0302" production update + 36kr's "The Entire Village Anticipates DeepSeek" article suggest we're in final hours, not days

**Edit 05.03.2026**

It has to happen today. DeepSeek Web was down for 40 minutes, although it had not been down at all in the last 30 days, and the same thing happened before the big launches of V3 and R1. In addition, today is the event of BYD, a DeepSeek partner. It will happen in the next few hours, and if not, then DeepSeek has missed the best window of opportunity they could ever have had.

**Edit 05.03.2026 Part 2**

**The model will not be released this week, and probably not next week either. DeepSeek V4 has been ready for a long time and only a few minor issues were left; otherwise the model would have been released last week or this week. There appears to be a major delay due to the government, which reportedly said at the last minute that DeepSeek is not allowed to release the model as long as it does not run on Chinese hardware. The model was trained on Nvidia, and the new technology in V4 was built entirely for Nvidia and not for Huawei, so such a restructuring naturally takes time. And I think we all still know what happened with R2...**

**Edit 07.03.2026**

When will Deepseek be released? After all the leaks, news, and crisis status, Deepseek V4 will and must come and cannot end like R2. The Chinese government has gone too far with its AI and told the US that it no longer needs it, whereupon Trump, in order not to appear weak, wants to impose a ban that will allow him to control all chip trade (meaning no more chips to China).

However, BYD and China have praised Deepseek too much in recent days. If V4 ended up like R2 and didn't come out at all, China would look extremely foolish, which the government would never allow.

That's why I suspect that Deepseek will receive help from the Chinese government (in recent years, Deepseek's CEO has been in frequent talks with the government and has received support from it) and will no longer adhere to any release pattern, as Deepseek has already missed three good release windows. My guess is that they will release it when it is least expected, which could be this weekend. (V3.2 was released on Sunday) In order to weaken and expose Nvidia and the entire US market with new AI technology.

Deepseek waiting until Claude or other providers are ready is incorrect and highly unlikely. Deepseek has problems and needs to fix them before release. V4 is already 90% complete (Lite has been corrected several times and is said to be just as intelligent as Sonnet 4.6). We also know that Deepseek's CEO is a perfectionist and would never release a half-finished product or leave it unfinished, as was the case with the GLM-5 release

**🚨 UPDATE 11.03.2026 – 22:00 CET – V4 WEIGHTS SPOTTED**

Major development: Chinese quantization expert u/bdsqlsz (青龍聖者) on X was spotted uploading **DeepSeek-V4-INT8** model shards to HuggingFace with the caption "it is coming." The upload shows multiple `model-0...` shards, a `.gitattributes`, and a `README.md`, indicating a full model repo creation.

**Why this is significant:**

* u/bdsqlsz is a verified, well-known quantization specialist — not a random account
* INT8 quantization requires access to the **full original weights** first
* Historically, community quants appear **within hours** of official weight releases (V3: same day, R1: same day, V3.2: within 24h)
* This means the official FP8/BF16 weights either already exist on HuggingFace (possibly private/unlisted) or u/bdsqlsz has NDA access

**Full leaked specs now confirmed:**

* \~1 Trillion parameters (MoE), \~32B active per token
* 1M native context window
* Multimodal: text + vision + audio
* Huawei Ascend 910C optimized
* MIT License

**Previous delays explained:** Huawei Ascend inference optimization (only 80% Nvidia efficiency), Blackwell chip fingerprint removal, and CEO Liang Wenfeng's perfectionism. The 40-min web outage on March 5 was likely a deployment test.

**My prediction: Official release within 24-72 hours.** The weights exist. The upload is happening. Keep your monitors running.

⚠️ UPDATE 11.03 – Unverified leak: u/bdsqlsz posted V4-INT8 weight uploads on X. r/LocalLLaMA is split – top comment (193 upvotes) questions authenticity. The file structure looks technically correct and INT8 aligns with Huawei optimization rumors, but previous V4 benchmark leaks in February were confirmed fake. Treat with caution until official deepseek-ai repo appears on HuggingFace."

Will update when it drops. 🚀


Gemini 2.5 Flash Image r/singularity 627 upvotes 54 comments September 2, 2025

Google is now officially calling "Gemini 2.5 Flash image preview", "Nano Banana"


Gemini 2.5 Flash Image r/singularity 467 upvotes 8 comments August 26, 2025

Google's new Gemini 2.5 Flash Image model can do some very impressive high-level image edits



AI tools related to Gemini 3.1 Pro vs Gemini 2.5 Flash Image

These tools are closely connected to one or both models in this comparison and can help you evaluate real-world fit.

AI Image Enhancer

BeautyPlus

BeautyPlus is an AI-powered online platform offering a comprehensive suite of image and video editing tools. It features an AI Image Enhancer to improve photo quality, resolution, color, and contrast, and includes advanced functionalities like blurry photo correction, noise reduction, and blemish minimization. Additionally, it integrates Nano Banana Pro, an AI image generator and editor powered by Google Gemini 3 Pro, enabling users to generate images from text, edit existing images with prompts, and combine elements from multiple images. The platform also provides various other tools such as background removers, object removers, AI filters, video enhancers, and more, catering to both professional and casual users for diverse creative needs.

1 visits

Large Language Models (LLMs)

googlegemini.co

googlegemini.co is a free tool for interacting with text and images, powered by the Google Gemini Pro API. It allows you to use Gemini easily without managing your own server or API configurations. Google Gemini is a multimodal AI developed by DeepMind capable of processing text, audio, images, and more. It is optimized for various devices, performs well on AI benchmarks, and is built with a focus on safety and responsible AI practices.

Free 0 visits 2 saves

AI Assistant

GeminiGoogle.cc

GeminiGoogle.cc is a platform dedicated to showcasing Google's most advanced AI model, Gemini. Built for native multimodality, Gemini reasons across text, images, video, audio, and code. It is available in three versions—Ultra, Pro, and Nano—to support tasks ranging from complex reasoning to on-device efficiency. The site highlights Gemini's performance, including its MMLU benchmarks, and provides examples of its capabilities in image generation, problem-solving, and multimodal analysis.

Free 0 visits 2 saves

AI Summarizer

Summarize and Translate Web Pages - Chrome Extension

The Summarize and Translate Web Pages Chrome extension enables you to summarize and translate web content with a single click. Powered by Google's Gemini AI, this tool provides high-quality summaries and translations for web pages, selected text, YouTube video captions, images, and PDF files.

Free

Which model should you choose?

Use the summary below to decide which model better fits your workflow, budget, and feature requirements.

Best fit for

Gemini 3.1 Pro

Gemini 3.1 Pro is a stronger fit for long-context workloads, reasoning-heavy tasks, and tool-augmented workflows.

Best fit for

Gemini 2.5 Flash Image

Gemini 2.5 Flash Image is a stronger fit for long-context workloads, multimodal applications, and cost-efficient scale.

Verdict

Both models are listed with the same 1,048,576-token context window, so long-context support does not separate them. Choose Gemini 3.1 Pro if you prioritize reasoning-heavy tasks and tool-augmented workflows. Choose Gemini 2.5 Flash Image if your workflow depends more on image-focused multimodal applications and cost-efficient scale.

FAQ

Common questions about Gemini 3.1 Pro vs Gemini 2.5 Flash Image

What is the main difference between Gemini 3.1 Pro and Gemini 2.5 Flash Image?

Both models handle long-context workloads with the same 1,048,576-token context window. Beyond that, Gemini 3.1 Pro leans toward reasoning-heavy tasks and tool-augmented workflows, while Gemini 2.5 Flash Image is better suited to image-focused multimodal applications and cost-efficient scale.

Which model is cheaper: Gemini 3.1 Pro or Gemini 2.5 Flash Image?

Gemini 2.5 Flash Image starts lower on input pricing at $0.30 per 1M input tokens, compared with $2.00 per 1M input tokens for Gemini 3.1 Pro.
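To make the input-price gap concrete, here is a minimal Python sketch that estimates input-only cost from the two per-1M-token prices quoted above. It assumes a flat rate per input token and ignores output tokens, image output, and any tiered or cached pricing, so treat the result as a rough lower bound. The function name `input_cost_usd` and the 200,000-token prompt size are hypothetical, chosen only for illustration.

```python
# Input prices in USD per 1M tokens, as quoted in this comparison.
# Assumption: a flat rate; output-token and image pricing are not modeled.
INPUT_PRICE_PER_MILLION = {
    "Gemini 3.1 Pro": 2.00,
    "Gemini 2.5 Flash Image": 0.30,
}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Return the estimated input-only cost in USD for one request."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return INPUT_PRICE_PER_MILLION[model] * input_tokens / 1_000_000


if __name__ == "__main__":
    prompt_tokens = 200_000  # hypothetical prompt size
    for model in INPUT_PRICE_PER_MILLION:
        cost = input_cost_usd(model, prompt_tokens)
        print(f"{model}: ${cost:.4f} for {prompt_tokens:,} input tokens")
```

At these listed prices the input cost for Gemini 3.1 Pro is roughly 6.7 times that of Gemini 2.5 Flash Image for the same prompt, before output costs are counted.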

Which model has the larger context window: Gemini 3.1 Pro or Gemini 2.5 Flash Image?

Neither: both models are listed with the same input context window of 1,048,576 tokens. They differ in maximum output, with Gemini 3.1 Pro listed at 65,536 tokens and Gemini 2.5 Flash Image at 32,768 tokens.

How should I evaluate Gemini 3.1 Pro vs Gemini 2.5 Flash Image for my use case?

This comparison currently includes 12 shared benchmark rows, helping you compare practical performance across overlapping evaluations.


Related comparisons