Back to news

Model Review

The $1-per-million club: Cheapest capable models.

Which models deliver real capability for under $1 per million input tokens? We rank DeepSeek V3.5 ($0.15), Gemini 3.5 Flash ($0.35), Qwen 3 ($0.40), and GPT-5.5 Instant ($0.50).

AI Kick Start editorial image for The $1-per-million club: Cheapest capable models.

Decision

Shortlist

Score tools by workflow fit, data handling, owner readiness, and cost at scale before buying seats.

Risk to watch

Shelfware

A capable tool still fails if nobody owns the workflow or checks whether it is used weekly.

Proof to collect

Pilot score

Run one real task through each shortlisted tool and record quality, time saved, and support burden.

TL;DR

TL;DR: AI inference keeps getting cheaper, and a band of models now reportedly runs under $1 per million input tokens while still doing real work. The figures circulating for specific models in mid-2026 are unconfirmed and, in several cases, don't match official pricing we could find. Treat the rankings below as a way to think about value, not as a price list to buy from. If you're processing a lot of text on a budget, the cheap tier is genuinely worth a look. Just check the live pricing page before you commit.

Key takeaways

  • The $1-per-million club: Cheapest capable models:
  • Analysis: A year ago, running a million tokens through a decent model cost real money.
  • The contenders: DeepSeek V3.5: $0.15 / 1M: $0.60 / 1M: 52.4%: 85.8%: 1M Gemini 3.5 Flash: $0.35 / 1M: $0.70 / 1M: 48.2%: 86.8%: 1M Qwen 3: $0.40 / 1M: $1.20 / 1M: 46.2%: 84.6%: 128K GPT-5.5 Instant: $0.50 / 1M: $1.50 / 1M: 42.1%: 84.2%: 128K A caution before you read the table as gospel: we couldn't confirm most of these figures, and a few clash with what the vendors publish.
  • Ranking by value: **1.
  • The price-performance curve: The headline point survives the messy details: the gap between cheap and capable has narrowed sharply.

The $1-per-million club: Cheapest capable models

Analysis

A year ago, running a million tokens through a decent model cost real money. Now the floor has dropped so far that "good enough" has stopped being expensive. That's the story underneath all the numbers: the cheap end of the market has caught up to where the frontier sat not long ago, and for a lot of everyday business work, you no longer need to pay flagship prices.

That's the genuine shift. The catch is the specifics. A list has been doing the rounds claiming exactly four models now sit under $1 per million input tokens, each with neat benchmark scores to match. When we went looking, most of those figures didn't line up with what the vendors actually publish. Some of the models don't appear to exist under the names given. So read what follows as a map of how to weigh cheap models against each other, not as a verified shopping list.

Here's why it matters for your team: if you're doing bulk document work, monitoring, classification, or first-draft generation, the cost difference between the cheap tier and a flagship model is large enough to change what's worth automating at all. The trick is matching the model to the job and confirming today's price yourself.

The contenders

ModelInput PriceOutput PriceSWE-bench ProMMLUContext
DeepSeek V3.5$0.15 / 1M$0.60 / 1M52.4%85.8%1M
Gemini 3.5 Flash$0.35 / 1M$0.70 / 1M48.2%86.8%1M
Qwen 3$0.40 / 1M$1.20 / 1M46.2%84.6%128K
GPT-5.5 Instant$0.50 / 1M$1.50 / 1M42.1%84.2%128K

A caution before you read the table as gospel: we couldn't confirm most of these figures, and a few clash with what the vendors publish. There's no "DeepSeek V3.5" in DeepSeek's own change log, the documented line runs V3, V3.1, V3.2 and V4, with V3.2 priced nearer $0.28 input / $0.42 output on a roughly 131K context. Gemini 3.5 Flash is real, but reported I/O 2026 pricing lands closer to $1.50 input, which would put it above the sub-$1 line, not under it. And while GPT-5.5 shipped in April 2026, its standard API pricing is reported around $5 / $30 per million on a 1M context, nowhere near $0.50 / $1.50. The SWE-bench Pro scores below are also unconfirmed and sit lower than the public leaderboard numbers we'd expect for named flagships. So take the ranking as reasoning about value, and verify the live numbers before you spend anything.

Ranking by value

1. DeepSeek V3.5, Best overall value. On the figures quoted ($0.15/$0.60, 1M context, 52.4% SWE-bench Pro, 85.8% MMLU), this would be the cheapest capable option with the longest context, and the open licence sweetens it further. Worth flagging: a model under exactly that name and price doesn't appear in DeepSeek's docs, so the real-world equivalent is more likely V3.2 or V4. Either way, for bulk document processing, monitoring and analysis, DeepSeek's cheap tier is hard to beat on cost per token.

2. Gemini 3.5 Flash, Best balance. The pitch is a small premium over DeepSeek in exchange for a higher MMLU (86.8%) and Google's production reliability. The $0.35/$0.70 pricing quoted here is the part to double-check, reported figures are several times higher, which would knock it out of the sub-$1 club entirely. If the cheap price holds where you are, Flash is a sensible default for production. If it doesn't, the reliability argument still stands, just at a higher cost.

3. Qwen 3, Best for multilingual. At a quoted $0.40/$1.20, Qwen 3 reads as the cheapest route for Asian-language work. The exact SKU and the 128K context are unconfirmed, the current Qwen line tends to ship with much larger windows, but the multilingual strength is the real draw here. If your workload leans into non-English content, this is the one to trial.

4. GPT-5.5 Instant, Best ecosystem integration. Instant is pitched as the priciest and weakest of the four on paper, earning its place through tight integration with OpenAI's platform. The $0.50/$1.50, 128K-context SKU quoted here isn't one we could verify against OpenAI's published pricing, which runs far higher. If you're already inside OpenAI's tooling, it's the path of least friction, just don't assume the cheap price tag.

The price-performance curve

The headline point survives the messy details: the gap between cheap and capable has narrowed sharply. The quoted scores put all four above 42% on SWE-bench Pro and above 84% on MMLU. The MMLU range is broadly believable for capable 2026 models; the SWE-bench numbers we couldn't confirm. The MMLU framing is the safer one to lean on, that level of general knowledge would have read as frontier performance a couple of years back, and now it's showing up at budget prices. The cheapening of AI is real even if these particular figures aren't nailed down.

Verdict

If you want maximum value, start by trialling DeepSeek's cheapest current model, but check which SKU is actually live, since "V3.5" may not be it. Step up to Gemini 3.5 Flash if you need Google's infrastructure or a bit more general knowledge, and confirm the price first, because it may sit above the sub-$1 line. Reach for Qwen 3 for multilingual work, and pick GPT-5.5 Instant only if you're already committed to OpenAI's stack. Across all of them, the rule is the same: verify today's pricing on the vendor's own page before you build anything on top of it.

Best value: DeepSeek V3.5

Source trail

Primary references to keep this briefing grounded

AI and automation information changes quickly. Use these official or primary references to verify the claims, pricing, product behaviour, and compliance details before committing budget or production data.

What to do next

  1. Write the job-to-be-done before looking at another product.
  2. Score each shortlisted tool for workflow fit, data handling, cost, and owner readiness.
  3. Run one small pilot and remove anything the team does not use weekly.

Want help applying this? Explore the AI tools directory.

AI Kick Start is an Illawarra-based AI studio in Figtree, helping businesses across Wollongong, Shellharbour and Kiama and right across Australia put AI to work.

Explore with AI

Use the article as a decision prompt

Summarise this AI Kick Start article for an Australian business owner. Focus on the useful decision, the risks, and the first practical next step: The $1-per-million club: Cheapest capable models

Turn this into a practical roadmap.

Use the guide as a starting point, then map the first workflow worth building.

Book an AI strategy call