AI News

Llama 4: Meta's Mixture-of-Experts Open Model and the Open AI Strategy.

Llama 4, reportedly released on 20 April 2026, is Meta's first open-weights MoE model. We analyse the architecture, performance, and strategic rationale behind Meta's continued investment in open AI.

Daniel Fleuren | 2026-04-28 | 11 min read | Founders and operators | Updated 2026-06-19

Written by

Daniel Fleuren

Founder, AI Kick Start. 20+ years enterprise IT

Decision

Shortlist

Score tools by workflow fit, data handling, owner readiness, and cost at scale before buying seats.

Risk to watch

Shelfware

A capable tool still fails if nobody owns the workflow or checks whether it is used weekly.

Proof to collect

Pilot score

Run one real task through each shortlisted tool and record quality, time saved, and support burden.

TL;DR

Meta's Llama 4, reportedly released on 20 April 2026, is described as its first open-weights Mixture-of-Experts model. (Note: Meta's own records date the Llama 4 launch to 5 April 2025, so treat the 2026 date with caution.) Strong reported benchmarks and a permissive-but-restricted licence back Meta's long-running bet that open AI is the road to platform dominance.

Key takeaways

Llama 4 reportedly uses a 400B-parameter MoE architecture with 45B active parameters per token; the active figure is unverified and doesn't match confirmed Llama 4 variants (Source: Meta, 2026)
Reported benchmark scores of 78.5% MMLU-Pro, 83.7% HumanEval, and 55.3% SWE-bench are unconfirmed and do not appear in Meta's announcement (Source: Meta, 2026)
The licence includes a [700M MAU clause](https://www.llama.com/llama4/license/) requiring a request to Meta; the claimed 100M MAU requirement appears fabricated (Source: Meta, 2026)
Meta's open AI strategy [aims for ecosystem influence rather than direct API monetisation](https://valueaddvc.com/blog/meta-llama-4-what-open-weight-model-leadership-means-for-the-ai-market) (Source: Meta, 2026)

Analysis

Among the big Western AI labs, Meta is the odd one out. OpenAI, Google, and Anthropic mostly sell access to closed models behind an API. Meta keeps handing the weights away for free.

That choice is the whole story behind Llama 4. The pitch is simple: if enough developers build on your model, you end up owning the ecosystem, even if you never charge for a single API call. Llama 4 is the latest and largest test of that idea. By the accounts being circulated it landed on 20 April 2026, though Meta's official announcement of the Llama 4 family actually dates to 5 April 2025, so the exact timing here is unconfirmed.

For Australian teams, the practical question isn't who has the biggest model. It's whether an open model you can run, fine-tune, and ship inside your own walls is good enough to skip the per-token bill. Llama 4 is Meta's answer, and the catch is in the licence fine print.

Architecture and Training

Llama 4 is Meta's first open-weights model to use a Mixture-of-Experts design, a real shift from Llama 2 and 3, which were dense models. The figures being quoted put it at 400 billion total parameters with roughly 45 billion active per token. Worth flagging: those specific numbers don't line up with any confirmed Llama 4 variant (the real 400B model, Maverick, runs about 17B active, and Scout is 109B total), so treat the 45B-active figure as unverified. The general idea holds, though: because only the active parameters do work on each token, an MoE model can deliver the quality of a much larger network at roughly the per-token compute of a dense model the size of its active set, while staying easier to deploy than something like the full 753B-parameter GLM-5.2 from Z.ai.
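To make the active-versus-total distinction concrete, here is a minimal Python sketch of the arithmetic. It only uses the parameter counts quoted above; the function name `active_share` and the `CONFIGS` table are illustrative, not anything Meta publishes.

```python
# Parameter counts in billions, taken from the paragraph above.
# The first entry is the unverified 400B/45B claim; Maverick is the
# confirmed 400B variant with about 17B active parameters.
CONFIGS = {
    "quoted (unverified)": (400, 45),
    "Maverick": (400, 17),
}


def active_share(total_b: float, active_b: float) -> float:
    """Return the fraction of parameters that do work on a single token."""
    if total_b <= 0 or not 0 < active_b <= total_b:
        raise ValueError("expected 0 < active <= total")
    return active_b / total_b


for name, (total_b, active_b) in CONFIGS.items():
    share = active_share(total_b, active_b)
    print(f"{name}: {share:.2%} of weights active per token")
```

On the quoted figures about 11% of the weights are active per token; on Maverick's it is closer to 4%. Either way, every expert still has to be held in memory, so an MoE model saves compute per token, not storage.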

On training data, the article cites roughly 18 trillion tokens, said to be larger than any earlier Llama model. Meta's own blog actually puts Llama 4 north of 30 trillion tokens, so the 18T figure looks off. The mix reportedly spans web pages, code, books, and a fair amount of multimodal content including images and video transcripts. Meta hasn't published the full breakdown, and a claim that roughly 12% of training tokens are non-textual is unconfirmed: no Meta source states that figure.

The MoE setup uses a learned router that sends each token to the most relevant subset of expert modules. There's a reported routing load balance of 97%, meaning no single expert gets swamped while others idle, but Meta hasn't published that specific number, so take it as unverified. Whatever the exact figure, balanced routing is the hard part of building one of these models, and it's what keeps inference efficient.
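The routing idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Meta's implementation: random numbers stand in for the scores a learned gating layer would produce, the expert count of 8 is arbitrary, and `load_balance` is a hypothetical metric (normalised entropy of expert load) because the source does not say how the 97% figure is defined.

```python
import math
import random
from collections import Counter


def softmax(scores):
    """Convert raw gate scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(gate_scores, top_k=1):
    """Return the indices of the top_k experts for one token, best first."""
    probs = softmax(gate_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:top_k]


def load_balance(assignments, num_experts):
    """Hypothetical balance metric: normalised entropy of expert load.

    1.0 means every expert receives the same share of tokens;
    0.0 means a single expert receives all of them.
    """
    if not assignments or num_experts < 2:
        raise ValueError("need at least one assignment and two experts")
    counts = Counter(assignments)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(num_experts)


# Assumption: random scores replace the output of a trained gating layer.
rng = random.Random(0)
NUM_EXPERTS = 8
assignments = []
for _ in range(1000):
    gate_scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    assignments.extend(route(gate_scores, top_k=1))

print(f"load balance: {load_balance(assignments, NUM_EXPERTS):.3f}")
```

With random scores the load comes out nearly even by construction. In a trained model the router has preferences, which is why training typically needs some mechanism to keep popular experts from being overloaded.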

Benchmark Performance

The scores doing the rounds put Llama 4 in mid-tier proprietary territory. MMLU-Pro: 78.5%. HumanEval: 83.7%. MATH: 66.1%. SWE-bench: 55.3%. Be aware these are unconfirmed. Meta's announcement uses comparative language rather than publishing these exact numbers, so the figures appear to be third-party or invented rather than official.

On those numbers, Llama 4 would reportedly sit above DeepSeek V3.5 on most tests, below a Kimi K2.7-Code variant on coding, and behind Claude Opus 4.8 across the board. That comparison is shaky: "DeepSeek V3.5" doesn't appear to exist (DeepSeek's current line is V4), and "Kimi K2.7-Code" is an unconfirmed variant name, though Kimi K2.7 and Claude Opus 4.8 are both real.

The 128,000-token context window cited here is another point to question. Real Llama 4 ships with far more headroom: Scout reportedly offers up to 10 million tokens, so the 128K figure understates what the model actually does. The claim that it trails the 1M-token offerings from MiniMax, DeepSeek, and Google doesn't hold up against that.

The Open-Weights Licence

The licence is where the real debate sits. It allows commercial use, modification, and distribution, but it carries restrictions that have annoyed parts of the open-source community. The headline clause: any company that hits 700 million monthly active users has to request a licence from Meta, granted at Meta's discretion. (The article frames this as automatic "termination"; the real licence frames it as a request requirement, but the 700M threshold is correct.)

The article also claims companies over 100 million monthly active users must request a licence. That one looks invented: the real Llama 4 Community License only has the 700M MAU threshold, with no 100M clause. Either way, the point critics raise stands: a licence with usage gates like this is "source available" with commercial strings, not open source in the classic sense.
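For teams that want the gate written down as a rule, a minimal Python sketch follows. It encodes only the single 700M threshold discussed above. The function name is hypothetical, and the strict "greater than" comparison is an assumption to check against the licence text, which also sets out how and when monthly active users are measured.

```python
# The only MAU threshold discussed above for the Llama 4 Community License.
MAU_REQUEST_THRESHOLD = 700_000_000


def must_request_licence(monthly_active_users: int) -> bool:
    """Return True when the MAU count is above the 700M threshold.

    Assumption: the comparison is strictly "greater than". Confirm the
    exact wording and measurement period in the licence itself.
    """
    if monthly_active_users < 0:
        raise ValueError("monthly_active_users cannot be negative")
    return monthly_active_users > MAU_REQUEST_THRESHOLD


# A business at the claimed 100M level is not caught by the real gate.
for mau in (100_000_000, 700_000_000, 700_000_001):
    print(f"{mau:,} MAU -> request needed: {must_request_licence(mau)}")
```

For almost every Australian business the answer is False by a wide margin, which is why the clause matters more as a signal about openness than as a practical constraint.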

Meta's defence is that the gates stop the biggest tech firms from free-riding on its training spend. A quote attributed to Joelle Pineau, described as Meta's VP of AI Research, makes the case at the launch event: "We're investing billions of dollars in training these models. The licence ensures that the largest beneficiaries of open AI are also contributing to its development." Treat this as unverified. The quote couldn't be confirmed, and Pineau in fact left Meta on 30 May 2025, before the article's claimed 2026 launch, so she couldn't have delivered it then.

Strategic Rationale

The logic behind Meta's open bet is plain enough. Meta doesn't sell AI API access as a core business the way OpenAI or Google do. Its money comes from advertising. So AI pays off for Meta by cutting internal costs, sharpening its products, and pulling developers into its orbit. Open-sourcing Llama gives it a talent pipeline, a stack of compatible tools, and a community with a stake in the ecosystem.

The wager is that open AI ends up like open-source software before it (Linux, Android, the web), where openness builds network effects that produce dominant platforms. If Llama becomes the default foundation developers reach for, Meta steers the technology's direction without metering every API call.

Source trail

Primary references to keep this briefing grounded

AI and automation information changes quickly. Use these official or primary references to verify the claims, pricing, product behaviour, and compliance details before committing budget or production data.

Meta Llama documentation

What to do next

Write the job-to-be-done before looking at another product.
Score each shortlisted tool for workflow fit, data handling, cost, and owner readiness.
Run one small pilot and remove anything the team does not use weekly.
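The scoring step above can be kept honest with a small weighted scorecard. The Python sketch below is one way to do it, not a prescribed method: the four criteria come from the list above, but the weights, the 1 to 5 rating scale, and the two shortlisted options are made-up placeholders to replace with your own.

```python
# Assumption: illustrative weights; adjust them to your own priorities.
WEIGHTS = {
    "workflow_fit": 0.4,
    "data_handling": 0.3,
    "cost": 0.2,
    "owner_readiness": 0.1,
}


def weighted_score(ratings: dict, weights: dict = WEIGHTS) -> float:
    """Combine 1-5 ratings into a single weighted score on the same scale."""
    if set(ratings) != set(weights):
        raise ValueError("ratings must cover exactly the weighted criteria")
    for criterion, value in ratings.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{criterion} rating must be between 1 and 5")
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


# Hypothetical shortlist with placeholder ratings from a pilot.
shortlist = {
    "self_hosted_open_model": {
        "workflow_fit": 4, "data_handling": 5, "cost": 3, "owner_readiness": 2,
    },
    "hosted_api": {
        "workflow_fit": 5, "data_handling": 3, "cost": 4, "owner_readiness": 4,
    },
}

ranked = sorted(shortlist, key=lambda n: weighted_score(shortlist[n]), reverse=True)
for name in ranked:
    print(f"{name}: {weighted_score(shortlist[name]):.2f}")
```

Writing the weights down before the pilot starts makes it harder to rationalise a favourite after the fact.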

AI Kick Start is an Illawarra-based AI studio in Figtree, helping businesses across Wollongong, Shellharbour and Kiama and right across Australia put AI to work.
