Analysis
If you've tried to buy serious AI compute this year, you already know the punchline: there isn't enough of it, and money alone won't fix that. The most expensive chip NVIDIA has ever shipped is also the one you can't get your hands on, and the wait is measured in seasons, not weeks.
That's the strange shape of the current AI boom. Everyone talks about smarter models and better data, but the thing actually rationing progress is a manufacturing step most people have never heard of, happening in a handful of buildings in Taiwan. When a cloud provider quotes you a 9-to-12-month lead time on Blackwell hardware, that's the bottleneck talking.
For an Australian business team, this matters even if you never touch a GPU directly. It's why your AI vendor's prices keep drifting up, why "we're capacity-constrained" has become a standard line, and why the gap between the labs that can scale and the ones that can't is widening. Here's what's behind it, and when it might let up.
The single biggest constraint in AI right now isn't algorithms, data, or talent. It's hardware, specifically NVIDIA's ability to build enough GPUs to satisfy the labs, cloud providers, and enterprises pouring money into AI infrastructure. Six months on from the Blackwell B200's wider rollout, that constraint hasn't loosened.
The B200 was a genuine step up in AI training and inference. NVIDIA's Blackwell architecture packs 208 billion transistors, a new FP4 precision format, and a fifth-generation NVLink interconnect (Wccftech). NVIDIA's headline numbers put it at roughly 4x the training performance of the Hopper H100 it replaces, with much larger gains claimed on inference (Tom's Hardware). Demand was immediate, and it was huge.
The Nature of the Shortage
This isn't simply a case of TSMC not starting enough wafers. The binding constraint is advanced packaging, TSMC's Chip-on-Wafer-on-Substrate (CoWoS) process, which fuses multiple GPU dies into the large packages that drive today's biggest training clusters.
CoWoS capacity has been growing, but demand has outrun it. TSMC plans to roughly triple its CoWoS output by the end of 2026 (TrendForce), but that means new cleanrooms, specialised tooling, and workers with skills that are hard to find. Building out advanced packaging capacity reportedly takes somewhere in the range of 18-24 months, which means the capacity coming online now was committed back in early 2025.
NVIDIA has tried to route around the problem with variants built on different packaging. The B200A, reportedly unveiled in 2024 and aimed at OEM customers, uses a simpler packaging approach (CoWoS-S rather than CoWoS-L) that trades some interconnect bandwidth for better availability (TrendForce). It isn't a stand-in for the full B200 in the largest clusters, though, where interconnect bandwidth is what sets the ceiling on performance.

Impact on AI Development
The shortage ripples outward. The big labs, names like OpenAI, Google, Anthropic and Meta, have locked in long-term supply deals that give them first call on limited production. Major buyers including Microsoft, Google, Meta and Amazon are documented placing multi-billion-dollar forward orders that soak up most of the allocation (Spheron). The reported prepayment-and-volume structure of those deals is something smaller players simply can't match.
So you get a two-tier market. Well-funded labs keep scaling their training runs, just with longer waits for new clusters. Smaller labs, startups, and academic researchers face 9-12 month waits for meaningful GPU allocations, data-centre GPU lead times have been reported at 36 to 52 weeks (Inworld), which pushes them onto cloud providers where spot availability is patchy and reserved instances mean long-term contracts.
Chinese labs have reportedly been squeezed by both the packaging shortage and US export controls. The most capable Blackwell parts remain restricted from sale to China; the chip that's actually been cleared for export is the Hopper H200, under tight conditions and a surcharge, while a China-specific Blackwell variant is reportedly still in the works (Tom's Hardware).
The Cost Impact
Scarcity shows up in the price. Cloud providers have reportedly raised prices on Blackwell-based instances, figures around 20-35% above initial announcements have circulated, though that specific range isn't pinned to a confirmed source, citing "market conditions." On the secondary market, individual B200 GPUs are rumoured to have changed hands at premiums of 200-300% over list, an unconfirmed figure, though NVIDIA has tried to curb resale through contract terms.
For enterprises building their own AI infrastructure, the climb is real. By way of illustration, a training cluster that might have run about $10 million in early 2025 could now cost in the $14-16 million range for equivalent Blackwell capacity, an indicative example rather than a sourced figure. That kind of pressure is steering teams toward other options: distilling models to cut training compute, quantisation and pruning to fit models on smaller hardware, and software tuning to wring more throughput out of the GPUs they already own.
When Will Relief Come?
TSMC has guided that CoWoS capacity will roughly triple by the end of 2026, with the biggest jumps in the back half of the year (TrendForce). NVIDIA has signalled it expects supply and demand to come into better balance in "late 2026," without getting more specific.
A few things could speed that up or slow it down. On the upside, some of TSMC's expansion is reportedly running ahead of schedule, and NVIDIA's packaging diversification is starting to pay off. On the downside, any fresh surge in demand, a major new model, or a jump in agentic AI that eats more inference compute, could swallow the new capacity as fast as it arrives.


