Claude Opus 4.7 token usage went up for reasons Anthropic documented in its own launch materials. The list price stayed at $5 per million input tokens and $25 per million output tokens, but Anthropic says the same input can now map to roughly 1.0-1.35× as many tokens, and the model may also generate more output because it “thinks more” at higher effort settings.
That combination is the whole story. Complaints about Claude Opus 4.7 pricing often sound like a generic price hike story; the documents point to something narrower and more annoying. Price per token is stable. Tokens per task are not.
A tokenizer is the system that chops text into the units a model bills and reasons over. Change the tokenizer, and the exact same prompt can become more or fewer billable tokens before the model has done any extra work. Add a model that spends more time reasoning on later turns, and now both the input side and output side can expand.
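A toy illustration makes this concrete. Neither rule below is a real tokenizer, least of all Anthropic’s; they only show how the same string becomes a different billable count under different segmentation rules:

```python
import re

text = "refactor the auth middleware to use async/await"

# Two made-up segmentation rules (NOT real tokenizers):
coarse = re.findall(r"\w+|\S", text)    # whole words plus punctuation
fine = re.findall(r"\w{1,3}|\S", text)  # smaller subword-like chunks

# Identical input text, different billable token counts.
print(len(coarse), len(fine))  # 9 17
```

Same characters in, nearly twice the tokens out; that is the entire mechanism behind input-side cost growth with no change in model behavior.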
What Anthropic Actually Changed in Claude Opus 4.7
Anthropic’s launch post says there are two token-usage changes to plan for.
The first is the updated tokenizer. Anthropic states that “the same input can map to more tokens, roughly 1.0-1.35× depending on the content type.” That is a direct admission that some prompts will cost more before you even get to model behavior.
The second is behavioral. Anthropic says Opus 4.7 “thinks more at higher effort levels, particularly on later turns in agentic settings,” which means more output tokens. Its product page describes this as adaptive thinking for “complex agentic workflows.”
That makes Claude Opus 4.7 token usage a two-multiplier problem:
| Cost driver | What changed | Anthropic’s evidence | Billing effect |
|---|---|---|---|
| Input tokenization | New tokenizer | Same input can become 1.0-1.35× as many tokens | Higher input token count |
| Output generation | More thinking at higher effort, later turns | More output in harder agentic workflows | Higher output token count |
Anthropic does not publish, in the cited materials, a workload-wide before/after study showing how these two effects combine across real production tasks. That missing dataset is where most of the argument lives.
Why the Same Prompt Can Cost More in Opus 4.7
If two models charge the same per-token rate, you would expect similar work to cost about the same. That expectation only holds if token counts are comparable.
A simple example shows why this breaks.
| Scenario | Opus 4.6 | Opus 4.7 with tokenizer only | Opus 4.7 with tokenizer + more output |
|---|---|---|---|
| Input tokens | 10,000 | 13,500 | 13,500 |
| Output tokens | 2,000 | 2,000 | 4,000 |
| Input cost @ $5/M | $0.0500 | $0.0675 | $0.0675 |
| Output cost @ $25/M | $0.0500 | $0.0500 | $0.1000 |
| Total | $0.1000 | $0.1175 | $0.1675 |
Same nominal pricing. Very different per-task bill.
The tokenizer alone raises total cost by 17.5% in this example: input is only half the bill at the base rates, so a 35% input inflation dilutes to a 17.5% overall increase. Add doubled output from higher-effort reasoning and the same task now costs 67.5% more, because output tokens are five times as expensive as input tokens.
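The table’s arithmetic is easy to reproduce. This sketch uses the same illustrative numbers (10k input, 2k output, a 1.35× input multiplier, doubled output); none of it is measured workload data:

```python
INPUT_RATE = 5 / 1_000_000    # $5 per million input tokens
OUTPUT_RATE = 25 / 1_000_000  # $25 per million output tokens

def task_cost(input_tokens: int, output_tokens: int) -> float:
    """Per-task cost at Opus list pricing."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

baseline = task_cost(10_000, 2_000)                    # Opus 4.6 scenario
tokenizer_only = task_cost(int(10_000 * 1.35), 2_000)  # input inflated 1.35x
both = task_cost(int(10_000 * 1.35), 4_000)            # plus doubled output

for label, cost in [("baseline", baseline),
                    ("tokenizer only", tokenizer_only),
                    ("tokenizer + output", both)]:
    print(f"{label}: ${cost:.4f} ({cost / baseline - 1:+.1%})")
```

The point of writing it out is that the two multipliers apply to different rates, so the blended increase depends entirely on each workload’s input/output mix.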
That is why “up to 35% more input tokens” is not the same as “up to 35% higher bill.” The answer depends on the input/output mix. Coding agents and long-running tool loops tend to be especially exposed because they accumulate both larger prompts and more generated reasoning over time. If you have been tracking Claude Code regression reports, this is the kind of interaction developers actually notice: sessions filling faster, responses getting longer, quotas disappearing sooner.
Is This Tokenizer Inflation or Real Shrinkflation?
The strongest verified claim is modest: Anthropic changed tokenization, and Anthropic says that can increase token counts by up to 35% for the same input.
The broader public claim is that effective costs rose much more than that. Some secondary analyses push in that direction. CloudZero and TokenCost both frame the issue as unchanged list pricing but higher effective cost per task. A practitioner write-up from ClaudeCodeCamp reports measurements using Anthropic’s count_tokens endpoint to compare tokenization across versions.
That is useful, but it still leaves a gap.
What is verified:
– Anthropic documented the tokenizer change.
– Anthropic documented that Opus 4.7 may produce more output in higher-effort, later-turn agentic use.
– Anthropic kept the per-token price unchanged.
What is plausible but not fully proven by primary sources:
– That many real workloads see increases well above the tokenizer range once output growth is included.
– That improved task success offsets the extra token spend often enough to cancel the cost increase.
“Shrinkflation” is the accusation because developers experience fewer useful turns within the same budget or context window. That part is intuitive. If text tokenizes less efficiently, a fixed context window behaves like a smaller one in practice. But Anthropic also claims capability gains, so the clean version is: the billable unit changed, and the public docs do not quantify the net effect per workload.
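The “fixed window behaves smaller” point is easy to quantify under assumptions. This sketch takes the top of Anthropic’s stated 1.0-1.35× range and an illustrative 200k-token window; the window size is an assumption for the example, not a quoted spec:

```python
def effective_window(nominal_tokens: int, inflation: float) -> int:
    """How much old-tokenizer text fits once inputs inflate."""
    return int(nominal_tokens / inflation)

# At the top of the stated 1.0-1.35x range, an illustrative 200k window
# holds roughly as much text as ~148k tokens used to.
print(effective_window(200_000, 1.35))  # 148148
print(effective_window(200_000, 1.0))   # 200000: unaffected content types
```

That is the practical shape of the complaint: nothing on the pricing page changed, but the number of useful turns that fit in a session did, for some content types.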
For readers comparing model families, that matters as much as benchmarks. A model can look stronger than peers and still be materially less efficient on your prompts. That is part of what makes comparisons like GLM-5 vs Claude Opus harder than a benchmark chart suggests.
What Developers Should Watch in Agentic Workflows
The biggest jumps in Claude Opus 4.7 token usage should show up where both multipliers stack.
High-risk workloads:
– Multi-turn coding agents that keep replaying large transcripts
– Long document workflows where retrieval chunks are already expensive to stuff into context
– Higher-effort settings that encourage extra reasoning tokens
– Late-turn sessions where the transcript is large and the model is more likely to “think more”
Lower-risk workloads:
– Short one-shot prompts
– Tasks with tiny outputs
– Pipelines that aggressively summarize or trim conversation state
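The last bullet is the most actionable one. Here is a minimal sketch of transcript trimming, with a hypothetical message format; production agents usually summarize dropped turns rather than discard them outright:

```python
def trim_transcript(messages: list[dict], keep_last: int = 6) -> list[dict]:
    """Keep any system prompt plus only the most recent turns.

    messages: [{"role": "system" | "user" | "assistant", "content": str}, ...]
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

history = [{"role": "system", "content": "You are a coding agent."}]
history += [{"role": "user", "content": f"turn {i}"} for i in range(20)]

trimmed = trim_transcript(history, keep_last=4)
print(len(trimmed))  # 5: the system prompt plus the last 4 turns
```

Trimming attacks both multipliers at once: less replayed input per turn, and a shorter transcript for the model to reason over on the turns where it is most inclined to think more.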
Three metrics matter more than list price:
1. Tokens per successful task
2. Average output tokens by turn number
3. Context growth rate over long sessions
If those move against you, unchanged Claude Opus 4.7 pricing will not help much.
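None of these metrics requires vendor tooling. The class below is a hypothetical sketch of a per-turn usage log, not part of any Anthropic SDK:

```python
from collections import defaultdict

class UsageLog:
    """Tracks the three metrics that matter more than list price."""

    def __init__(self):
        self.turns = []  # (turn_number, input_tokens, output_tokens, success)

    def record(self, turn, input_tokens, output_tokens, success):
        self.turns.append((turn, input_tokens, output_tokens, success))

    def tokens_per_successful_task(self):
        total = sum(i + o for _, i, o, _ in self.turns)
        successes = sum(1 for *_, ok in self.turns if ok)
        return total / successes if successes else float("inf")

    def avg_output_by_turn(self):
        by_turn = defaultdict(list)
        for turn, _, out, _ in self.turns:
            by_turn[turn].append(out)
        return {t: sum(v) / len(v) for t, v in sorted(by_turn.items())}

    def context_growth(self):
        """Input-token deltas between consecutive turns of a session."""
        inputs = [i for _, i, _, _ in self.turns]
        return [b - a for a, b in zip(inputs, inputs[1:])]

log = UsageLog()
log.record(1, 10_000, 1_500, success=True)
log.record(2, 13_000, 2_200, success=True)
log.record(3, 17_500, 3_900, success=False)

print(log.context_growth())      # [3000, 4500]: replayed context keeps growing
print(log.avg_output_by_turn())  # output lengthening on later turns
```

Comparing these numbers before and after a model upgrade gives you the workload-level before/after table that the public docs do not.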
Anthropic’s docs are at least unusually explicit that something changed. What they do not provide is the number most teams actually need: before-and-after cost by workload class. That is the missing table.
For teams already dealing with version churn, that uncertainty is familiar; the Claude Opus 4.6 support changes showed how quickly the practical behavior of a model line can shift without looking like a formal pricing event.
Key Takeaways
- Anthropic says Claude Opus 4.7 token usage can rise for two separate reasons: a new tokenizer and more output from higher-effort reasoning.
- The new tokenizer can turn the same input into 1.0-1.35× as many tokens, according to Anthropic’s launch post.
- Per-token pricing stayed at $5/M input and $25/M output, so the issue is effective per-task cost, not headline list price.
- Agentic workflows are most exposed because they combine long transcripts, repeated context replay, and more generated output on later turns.
- Anthropic has not published a broad workload-level study showing whether capability gains offset the extra token spend in common production use.
Further Reading
- *Introducing Claude Opus 4.7*, Anthropic’s launch post detailing the tokenizer change and higher-effort behavior.
- *Claude Opus 4.7 product page*, Anthropic’s pricing and product page for Opus 4.7.
- *Claude Opus 4.7 Pricing In 2026: What It Actually Costs*, secondary analysis of unchanged list price versus higher effective spend.
- *Claude Opus 4.7 pricing: $5/1M, new tokenizer explained*, practical explanation of tokenizer-driven cost changes.
- *I Measured Claude 4.7’s New Tokenizer. Here’s What It Costs You.*, practitioner measurements using Anthropic’s token counting endpoint.
The open question is not whether usage changed; it is which workloads now cost enough more that a benchmark win stops being the relevant number.
