Back to news

Model Review

MiniMax M3 vs DeepSeek V3.5: Best open-weights model?

MiniMax M3 ($0.30/$1.20, 59.0% SWE-bench Pro) vs DeepSeek V3.5 ($0.15/$0.60, 52.4% SWE-bench Pro). Both have 1M context and open licences. Which open model should you choose?

AI Kick Start editorial image for MiniMax M3 vs DeepSeek V3.5: Best open-weights model?.

Decision

Shortlist

Score tools by workflow fit, data handling, owner readiness, and cost at scale before buying seats.

Risk to watch

Shelfware

A capable tool still fails if nobody owns the workflow or checks whether it is used weekly.

Proof to collect

Pilot score

Run one real task through each shortlisted tool and record quality, time saved, and support burden.

TL;DR

TL;DR: MiniMax M3 ($0.30/$1.20, 59.0% SWE-bench Pro) vs DeepSeek V3.5 ($0.15/$0.60, 52.4% SWE-bench Pro). Both have 1M context and open licences. Which open model should you choose?

Key takeaways

  • MiniMax M3 vs DeepSeek V3.5: Best open-weights model?: A quick warning before you read on: the original draft of this comparison rests on a model that does not appear to exist.
  • Head-to-head benchmarks: SWE-bench Pro: 59.0%: 52.4% (unconfirmed): +6.6 pts (MiniMax) MMLU: 86.4% (unverified): 85.8% (unconfirmed): +0.6 pts (MiniMax) Context window: 1M: 1M: , Price (input): $0.30 / 1M: $0.15 / 1M (unconfirmed): DeepSeek 2x cheaper Price (output): $1.20 / 1M: $0.60 / 1M (unconfirmed): DeepSeek 2x cheaper Licence: Open: Open: , One row holds up cleanly.
  • Where MiniMax M3 wins: **Coding.** MiniMax M3's SWE-bench Pro result is the strongest claim in the piece.
  • Where DeepSeek V3.5 wins: **Price.** This whole section depends on the unconfirmed $0.15/$0.60 figure, so take it lightly.
  • The 1M context parity: The 1M context point is the one part of the comparison that survives, even if the labelling is off.

MiniMax M3 vs DeepSeek V3.5: Best open-weights model?

A quick warning before you read on: the original draft of this comparison rests on a model that does not appear to exist. "DeepSeek V3.5" returns no real release. DeepSeek's actual 2026 open-weights lineup is V3.2 and V4 (V4-Pro and V4-Flash, shipped 24 April 2026 under MIT). So treat every "DeepSeek V3.5" figure below as unconfirmed and almost certainly mislabelled. The MiniMax M3 numbers, by contrast, mostly check out.

With that caveat in place: the pitch was that MiniMax M3 and "DeepSeek V3.5" were the two strongest open-weights models you could run in June 2026, both with 1M-token context windows, both openly licensed, and both priced far below the closed models from OpenAI and Google. The real story is narrower. MiniMax M3 is genuine, released 1 June 2026, open-weight, 1M context. Its sparring partner here is not.

For Australian teams weighing an open model to self-host or run cheaply through an API, that distinction matters. You can act on the MiniMax M3 details. The DeepSeek side needs to be re-checked against V4-Pro or V4-Flash before you put a dollar behind it.

Head-to-head benchmarks

MetricMiniMax M3DeepSeek V3.5 (unverified)Delta
SWE-bench Pro59.0%52.4% (unconfirmed)+6.6 pts (MiniMax)
MMLU86.4% (unverified)85.8% (unconfirmed)+0.6 pts (MiniMax)
Context window1M1M,
Price (input)$0.30 / 1M$0.15 / 1M (unconfirmed)DeepSeek 2x cheaper
Price (output)$1.20 / 1M$0.60 / 1M (unconfirmed)DeepSeek 2x cheaper
LicenceOpenOpen,

One row holds up cleanly. OpenRouter lists MiniMax M3 at $0.30 per 1M input tokens and $1.20 per 1M output, which matches the table. The DeepSeek pricing of $0.15/$0.60 lines up with no real DeepSeek model: V4-Flash sits at roughly $0.14/$0.28, V4-Pro at about $0.435/$0.87, and V3.2 near $0.23/$0.34. So the "2x cheaper" framing rests on a price that does not exist.

Where MiniMax M3 wins

Coding. MiniMax M3's SWE-bench Pro result is the strongest claim in the piece. VentureBeat reports M3 at 59.0% on SWE-bench Pro, narrowly ahead of GPT-5.5 at 58.6%. Worth noting that this is a vendor-run benchmark, so read it as MiniMax's own scorecard rather than an independent audit. The 6.6-point lead over "DeepSeek V3.5" should be ignored, since the comparison model is fictional. If you want a real benchmark fight, line M3 up against DeepSeek V4-Pro, which third-party reviews put around 55.4% on the same test.

General knowledge. The 86.4% MMLU figure for M3 is unverified, no public source confirms it, and most M3 coverage focuses on coding and agentic tasks rather than MMLU. The supposed 0.6-point edge over "DeepSeek V3.5" is meaningless given the other number is attached to a model that does not exist.

Where DeepSeek V3.5 wins

Price. This whole section depends on the unconfirmed $0.15/$0.60 figure, so take it lightly. The original argument was that DeepSeek would be half the price of MiniMax M3, and that the gap compounds on high-volume work like document processing, content analysis, or monitoring. The worked example claimed 100M tokens would cost $15,000 on MiniMax M3 versus $7,500 on DeepSeek. That sum is built on the fabricated DeepSeek pricing and an unusual assumption that bills all 100M tokens at output-style rates, so it does not hold up. If cost is your deciding factor, price it against a real DeepSeek model, V4-Flash in particular is genuinely cheap.

Inference efficiency. The draft claimed DeepSeek was more parameter-efficient and pushed higher throughput on the same hardware "in our testing." There's no published benchmark behind that, and again it points at a model that does not exist, so treat it as unconfirmed. The real DeepSeek V4 does lean on Compressed Sparse Attention for efficiency, but that's a different model and a different claim.

The 1M context parity

The 1M context point is the one part of the comparison that survives, even if the labelling is off. MiniMax M3's 1M-token window is confirmed. DeepSeek's real current model, V4, also ships a 1M-token window, so the parity is genuine, it's just that the matching model is V4, not the "V3.5" named here. The claim that both held accuracy at the far edges of their context windows came from the author's own long-context testing, with no methodology or independent eval attached, so treat that as unconfirmed too.

Verdict

Strip out the fiction and what's left is one model you can actually evaluate. MiniMax M3 is real, openly licensed, runs a confirmed 1M context, lists at $0.30/$1.20 on OpenRouter, and posts a vendor-reported 59.0% on SWE-bench Pro. That's a credible open coding model on its own terms.

The "DeepSeek V3.5" half of this comparison should not drive any decision. If you're shopping DeepSeek, look at V4-Pro, V4-Flash, or V3.2 and pull their real prices and benchmarks before you commit. The headline question, which open-weights model is best, is worth asking, but it needs two models that exist to answer it.

Winner: MiniMax M3 (the only verifiable contender here) / "DeepSeek V3.5" comparison unconfirmed

Source trail

Primary references to keep this briefing grounded

AI and automation information changes quickly. Use these official or primary references to verify the claims, pricing, product behaviour, and compliance details before committing budget or production data.

What to do next

  1. Write the job-to-be-done before looking at another product.
  2. Score each shortlisted tool for workflow fit, data handling, cost, and owner readiness.
  3. Run one small pilot and remove anything the team does not use weekly.

Want help applying this? Explore the AI tools directory.

AI Kick Start is an Illawarra-based AI studio in Figtree, helping businesses across Wollongong, Shellharbour and Kiama and right across Australia put AI to work.

Explore with AI

Use the article as a decision prompt

Summarise this AI Kick Start article for an Australian business owner. Focus on the useful decision, the risks, and the first practical next step: MiniMax M3 vs DeepSeek V3.5: Best open-weights model?

Turn this into a practical roadmap.

Use the guide as a starting point, then map the first workflow worth building.

Book an AI strategy call