DeepSeek V4 Pro Is $0.87 Per Million Tokens Now. GPT-5.5 Wants $30. The Math Is Brutal.

Key Takeaways

– V4 Pro output tokens: $0.87/M, permanent as of May 23. That’s 34x cheaper than GPT-5.5’s $30/M and 29x cheaper than Claude Opus 4.7’s $25/M.
– Input tokens run $0.435/M. Cache hits? Just $0.003625/M. The 75% discount everyone thought would expire May 31 is now the floor forever.
– A coding agent costing $180/day on Claude Opus runs $6/day on V4 Pro. Yeah. Six bucks.
– DeepSeek runs on Huawei Ascend 950 chips, not Nvidia H100s. That’s why this pricing has nothing to do with GPU scarcity.
– The catch: you’re sending API traffic through Chinese infrastructure. Real compliance concern. I’ll give you a framework for when it makes sense and when it doesn’t.

—

So here’s what happened. DeepSeek took that 75% promotional discount it was running and, on May 23, just made it permanent. No expiration. No asterisks. The V4 Pro model — 1.6 trillion parameters, 1M-token context window. Now costs $0.87 per million output tokens. Input is $0.435/M. Cache hits?

A laughably low $0.003625/M.

Originally the promo was supposed to end May 31.

They moved it up eight days.

If you’re running AI in production.

Not screwing around, not doing hobby projects, but actually baking this into something that generates invoices. This rewrites your infrastructure budget. Let me show you the numbers most coverage is skipping. And what to actually test this week.

What $0.87/M Actually Means Against GPT-5.5 and Claude Opus

Straight comparison, output token pricing as of May 23:

| Model | Output $/M tokens | Relative cost |
|—|—|—|
| GPT-5.5 | $30.00 | Baseline |
| Claude Opus 4.7 | $25.00 | 0.83x GPT-5.5 |
| DeepSeek V4 Pro | $0.87 | 0.029x GPT-5.5 |

V4 Pro is 34x cheaper than GPT-5.5. 29x cheaper than Claude Opus.

Now put a real workload behind it. A coding agent that reads your codebase, generates changes, runs tests, iterates. Not a toy. A proper agentic pipeline. Eats about 200,000 output tokens per run. Run it 20 times a day. Totally reasonable for autonomous coding.

– GPT-5.5: 4M output tokens/day = $120/day
– Claude Opus: 4M output tokens/day = $100/day
– V4 Pro: 4M output tokens/day = $3.48/day

No typo. Same workload. $120 on OpenAI. $3.48 on DeepSeek.

Over a month: $3,600 versus $104.40.

I work with clients running multi-agent pipelines that burn tokens like it’s going out of style.

Because at $25/M, honestly, the meter never stops. These prices make workloads that were straight-up budget-impossible suddenly worth firing up.

The Angle Everyone’s Getting Wrong

Here’s what bugs me about the coverage. Everyone’s calling this a price war. “DeepSeek undercuts Western AI with huge discount.” Wrong framing.

And it’ll get you in trouble if you build your strategy around it.

DeepSeek isn’t cutting prices as compute got cheap.

They’re cutting prices since they’ve got a supply problem.

ide.com ran the numbers on chip inventory versus demand. With roughly 750,000 Huawei Ascend 950 chips deployed, DeepSeek currently covers about 37% of API demand. Not 100%. Not even half.

They’re deliberately under-serving the market.

So why would a company price below market when they can’t even fulfill current demand?

Simple. In eight months, the constraint goes away. Huawei’s shipping more Ascend 950 supernodes in H2 2026. When that capacity hits, DeepSeek can handle way more traffic. And the developers who’ve already routed their pipelines, hardcoded base URLs, optimized prompts for V4 Pro? They stay.

Even after the next competitor drops prices.

This is customer acquisition subsidized by future compute. DeepSeek’s paying for your habit right now, expecting to collect later when they can actually serve you properly.

Doesn’t mean avoid it.

Means understand what you’re signing up for. Route critical business workflows through DeepSeek. And if they raise prices in Q1 2027 after Huawei silicon ships at scale, you’ll be irritated. Fair enough. But right now, at $0.87/M, you’re getting below-market rates with a below-market guarantee.

Take the discount. Don’t expect it forever.

Three Things That Just Became Obvious to Run

Where it stops being abstract. Three workflows small teams and solo operators can actually spin up now.

Autonomous coding agent. Used to cost $200-400/month on Claude or GPT. At $3.48/day on V4 Pro, you’re at $104/month for the same work. That’s a rounding error in a freelancer’s monthly software budget. Solo dev, you want an agent that reviews PRs, writes tests, refactors old garbage, ships without you hovering? This is the price point where it makes sense.

RAG research pipeline. Got a knowledge base. Client docs, past project notes, internal SOPs? Want an agent querying that daily to surface relevant stuff? Token burn used to be the blocker. At $0.87/M, a small company running 50 queries/day across a 500K-token corpus runs about $3/day. $90/month for a research assistant that never forgets, never calls in sick, never needs a coffee break.

Daily content pipeline. Mediascout-specific but the math applies broadly. Automated system: pull data, summarize, draft post, format for publication. Now runs on V4 Pro for $5-10/day in tokens. Human time saved compounds faster than the token savings, honestly. No more grinding through research manually.

None of this is theoretical.

I have clients running all three right now on more expensive infrastructure. Migration isn’t hard. DeepSeek supports both OpenAI-compatible and Anthropic-compatible API formats. Swap the base URL, update your client initialization, done. No prompt rewrites needed. Side note: their docs are a bit of a mess, but the API itself is clean.

The Honest Framework: When to Use It and When Not To

I’m not gonna pretend the elephant isn’t in the room.

You’re routing API traffic through a Chinese company’s infrastructure. That matters for compliance, data residency, and the reliability guarantees your procurement team loses sleep over. Regulated industry. Healthcare, finance, anything touching HIPAA or SOC 2. This conversation needs your legal team, not a blog post.

For everyone else: the elephant’s real, but often smaller than the anxiety suggests.

Latency is a fair concern. DeepSeek’s servers are geographically closer to Asian users than to North America or Europe. For real-time stuff — yeah, benchmark it. But for overnight batch processing, research pipelines, async content generation? Latency’s irrelevant.

Uptime history? DeepSeek’s had documented interruptions. So has OpenAI. So has everyone at this point. Question is whether your workflow tolerates a 15-minute blip. Most do.

The compliance question for US businesses is murkier.

Current regulations don’t explicitly prohibit Chinese AI APIs for non-regulated tasks. But the political and reputational calculus is shifting. If your clients are watching your tech stack. And some are — using DeepSeek for their deliverables is a conversation worth having proactively.

What I tell clients: internal tooling, cheapest route.

Client-facing or compliance-sensitive work, providers with documented data policies and US legal standing. Price difference doesn’t justify risk on the wrong workload.

Test One Workflow This Week

V4 Pro at $0.87/M output tokens isn’t a promo that ended. It’s the new floor. At least until H2 2026 when Huawei’s chip supply scales and the market adjusts again.

The math’s unambiguous for agentic, token-heavy workloads. Same pipeline: $3,000/month on Claude Opus, $90/month on V4 Pro. That’s not a rounding error. That’s a budget category change.

What I’d do running lean: pick one workflow. The one burning the most tokens. Spend two hours getting it running on DeepSeek. Swap the base URL, validate outputs against your current provider, measure quality differences if any. Calculate the actual savings.

Outputs comparable and savings real? Answer’s clear. Quality noticeably worse on nuanced tasks? Now you know the boundary and can route accordingly.

The price gap isn’t closing. It’s widening as DeepSeek buys developer loyalty at a discount. Time to benchmark is now, while the promotional mindset still applies — even though it technically isn’t a promotion anymore.

Run the numbers. Then run the pipeline.

—

Sources: The Decoder (May 23), Engadget (May 23), Techmeme (May 24), ide.com analysis (May 23), MEXC News (May 24). Pricing confirmed against DeepSeek official API documentation.