DeepSeek V3.5 review: $0.15 per million input tokens
Release date: 20 March 2026 (reported, unconfirmed) | Status: Active | Licence: Open
Editor's note: We could not confirm that a model called "DeepSeek V3.5" exists. Public DeepSeek pricing pages, the official DeepSeek API docs, and independent trackers show the lineup going from V3.2 (1 December 2025) straight to V4 (24 April 2026), with no V3.5 and no 20 March 2026 release. The figures below (pricing, benchmarks, the 1M context window) and at least one competitor name could not be tied to any released model. Treat this piece as a reported-but-unverified product profile, not a confirmed review.
If you only read one line about the cheap-model race in 2026, it's this: input pricing has fallen far enough that workloads nobody could afford a year ago are suddenly on the table. Reading a full document archive into a model. Watching a codebase around the clock. Chewing through a social feed in real time. The cost of "just feed it everything" has collapsed.
DeepSeek V3.5 is one of the names attached to that shift, with a reported price of $0.15 per million input tokens and $0.60 per million output tokens, open weights, and a 1M-token context window. On paper, that combination undercuts most of the field. We say "reported" because, as the note above flags, we couldn't find evidence the model exists under that name; DeepSeek's real lineup appears to jump from V3.2 to V4.
So the honest framing is this. What follows describes the model as it has been presented to us. The numbers are striking. They are also unconfirmed, and the central subject may be a mix-up with a real DeepSeek release. Read it that way, and the broader point still holds: the cheap end of the market is where the interesting economics are happening.
The question worth asking isn't whether a model like this is cheap. It's whether something this cheap is good enough to trust with real work.
Benchmarks at a glance
| Metric | Reported value | Notes |
|---|---|---|
| SWE-bench Pro | 52.4% | Decent for the price |
| MMLU | 85.8% | Strong general knowledge |
| Context window | 1M tokens | Best-in-class |
| Price (input) | $0.15 / 1M tokens | Cheapest input in survey |
| Price (output) | $0.60 / 1M tokens | Very cheap |
| Licence | Open | Self-hostable |
These figures are as reported and could not be verified against a primary source. For context, independent trackers (Artificial Analysis, pricepertoken) report different numbers for DeepSeek's actual models: V3.2 sits around $0.23 input and $0.34 output per million tokens, while V4 Flash is reportedly closer to $0.14/$0.28.
The pricing analysis
The reported input price of $0.15 per million tokens would be the lowest in our survey. Put in plain terms: processing one billion input tokens would cost $150. On Gemini 3.5 Flash, itself a value pick, the same volume reportedly runs about $350 (Artificial Analysis; the specific per-token figure is unconfirmed). On Opus 4.8, the comparison figure is around $5,000, also unverified.
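The comparison above is plain per-token arithmetic, and a minimal sketch makes it easy to rerun with other volumes. The DeepSeek V3.5 price is the reported figure from this review; the per-million prices for the other two models are back-calculated from the per-billion costs quoted above, so they are assumptions rather than published rates. The function and variable names are illustrative only.

```python
# Input prices in USD per million tokens.
# All three are reported or derived figures from this review, none verified.
INPUT_PRICE_PER_MILLION = {
    "DeepSeek V3.5 (reported)": 0.15,
    "Gemini 3.5 Flash (derived from ~$350 per billion)": 0.35,
    "Opus 4.8 (derived from ~$5,000 per billion)": 5.00,
}

ONE_BILLION = 1_000_000_000


def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of sending `tokens` input tokens at the given price."""
    return tokens / 1_000_000 * price_per_million


# Cost of one billion input tokens on each model.
costs = {
    name: input_cost_usd(ONE_BILLION, price)
    for name, price in INPUT_PRICE_PER_MILLION.items()
}

for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f} per billion input tokens")
```

Swapping in live prices from a provider's pricing page is the obvious next step before relying on any of these totals.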
At those prices, jobs that used to be uneconomical start to make sense: reading an entire corporate document archive, processing a social media firehose, running continuous checks over a large codebase. A 1M-token context window, again, reported rather than confirmed, would push that further, letting you ingest a large document in one pass without much cost.
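To put a number on "one pass without much cost", here is a rough sketch of the cost of a single job at the reported $0.15 input and $0.60 output prices per million tokens. The token counts in the example (a full 1M-token context and a 2,000-token answer) are hypothetical, chosen only for illustration, and the prices themselves are unverified.

```python
# Reported, unverified prices in USD per million tokens.
REPORTED_INPUT_PRICE = 0.15
REPORTED_OUTPUT_PRICE = 0.60


def job_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price: float = REPORTED_INPUT_PRICE,
    output_price: float = REPORTED_OUTPUT_PRICE,
) -> float:
    """Total USD cost of one request: input and output are billed separately."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical job: fill the full 1M-token window once, get a 2,000-token answer.
single_pass = job_cost_usd(input_tokens=1_000_000, output_tokens=2_000)

# The same job repeated 1,000 times, e.g. one pass per document in an archive.
archive_run = 1_000 * single_pass

print(f"One full-context pass: ${single_pass:.4f}")
print(f"1,000 such passes:     ${archive_run:,.2f}")
```

At these rates the input side dominates: filling the window costs about 15 cents, while a short answer adds a fraction of a cent.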
Capabilities
A reported 52.4% on SWE-bench Pro would put this model in the middle of the open-weights pack. It would handle routine coding fine: ahead of Qwen 3 (46.2%) and Llama 4 (50.2%), behind Kimi K2.7-Code (56.8%) and MiniMax M3 (59.0%). Worth a caveat here: those comparison scores are unverified, and we found no evidence that a model called "Kimi K2.7-Code" exists; the real Moonshot release appears to be Kimi K2.6. The reported 85.8% MMLU is strong on paper, matching or beating several pricier models, but it too is unconfirmed.
The open-weights advantage
As with the other models in this review, an open licence would mean you can self-host for sensitive workloads. The model is reportedly available in several quantisations, from Q4 through FP16. On the setups we've seen described, a Q5_K_M quant on a single A100 40GB lands as a sensible balance of quality and speed for batch processing. The quantisation tiers and the hardware are real, everyday concepts; whether they fit this particular model is unverified, since we couldn't confirm the model itself.
Verdict
If the reported specs held up, DeepSeek V3.5 would be the value pick for high-volume, context-heavy work. Not the best coder, not the best reasoner, but at a reported $0.15/$0.60 with a 1M context and open weights, it wouldn't need to be. For budget-conscious teams with large document or code analysis needs, it would be an easy call.
The catch is the one we opened with: we couldn't verify that this model exists as described. Before you build anything on it, check the live DeepSeek pricing and model docs and confirm you're looking at a real, released model (V3.2 or V4) rather than a name that doesn't match the current lineup.
Score: 8.3 / 10 (on the reported specs; treat as provisional given the verification problems above)



