Claude Opus 4.7 token usage went up for reasons Anthropic documented in its own launch materials. The list price stayed at $5 per million input tokens and $25 per million output tokens, but Anthropic says the same input can now map to roughly 1.0-1.35× as many tokens, and the model may also generate more output because it “thinks more” at higher effort settings.
That combination is the whole story. Complaints about Claude Opus 4.7 pricing often sound like a generic price hike story; the documents point to something narrower and more annoying. Price per token is stable. Tokens per task are not.
A tokenizer is the system that chops text into the units a model bills and reasons over. Change the tokenizer, and the exact same prompt can become more or fewer billable tokens before the model has done any extra work. Add a model that spends more time reasoning on later turns, and now both the input side and output side can expand.
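A toy illustration makes this concrete. Neither rule below is a real tokenizer, least of all Anthropic’s; they only show how the same string becomes a different billable count under different segmentation rules:

```python
import re

text = "refactor the auth middleware to use async/await"

# Two made-up segmentation rules (NOT real tokenizers):
coarse = re.findall(r"\w+|\S", text)    # whole words plus punctuation
fine = re.findall(r"\w{1,3}|\S", text)  # smaller subword-like chunks

# Identical input text, different billable token counts.
print(len(coarse), len(fine))  # 9 17
```

Same characters in, nearly twice the tokens out; that is the entire mechanism behind input-side cost growth with no change in model behavior.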
What Anthropic Actually Changed in Claude Opus 4.7
Anthropic’s launch post says there are two token-usage changes to plan for.
The first is the updated tokenizer. Anthropic states that “the same input can map to more tokens, roughly 1.0-1.35× depending on the content type.” That is a direct admission that some prompts will cost more before you even get to model behavior.
The second is behavioral. Anthropic says Opus 4.7 “thinks more at higher effort levels, particularly on later turns in agentic settings,” which means more output tokens. Its product page describes this as adaptive thinking for “complex agentic workflows.”
That makes Claude Opus 4.7 token usage a two-multiplier problem:
| Cost driver | What changed | Anthropic’s evidence | Billing effect |
|---|---|---|---|
| Input tokenization | New tokenizer | Same input can become 1.0-1.35× as many tokens | Higher input token count |
| Output generation | More thinking at higher effort, later turns | More output in harder agentic workflows | Higher output token count |
Anthropic does not publish, in the cited materials, a workload-wide before/after study showing how these two effects combine across real production tasks. That missing dataset is where most of the argument lives.
Why the Same Prompt Can Cost More in Opus 4.7
If two models charge the same per-token rate, you would expect similar work to cost about the same. That expectation only holds if token counts are comparable.
A simple example shows why this breaks.
| Scenario | Opus 4.6 | Opus 4.7 with tokenizer only | Opus 4.7 with tokenizer + more output |
|---|---|---|---|
| Input tokens | 10,000 | 13,500 | 13,500 |
| Output tokens | 2,000 | 2,000 | 4,000 |
| Input cost @ $5/M | $0.0500 | $0.0675 | $0.0675 |
| Output cost @ $25/M | $0.0500 | $0.0500 | $0.1000 |
| Total | $0.1000 | $0.1175 | $0.1675 |
Same nominal pricing. Very different per-task bill.
The tokenizer alone raises total cost by 17.5% in this example: input is only half the bill at the base rates, so a 35% input inflation dilutes to a 17.5% overall increase. Add doubled output from higher-effort reasoning and the same task now costs 67.5% more, because output tokens are five times as expensive as input tokens.
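The table’s arithmetic is easy to reproduce. This sketch uses the same illustrative numbers (10k input, 2k output, a 1.35× input multiplier, doubled output); none of it is measured workload data:

```python
INPUT_RATE = 5 / 1_000_000    # $5 per million input tokens
OUTPUT_RATE = 25 / 1_000_000  # $25 per million output tokens

def task_cost(input_tokens: int, output_tokens: int) -> float:
    """Per-task cost at Opus list pricing."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

baseline = task_cost(10_000, 2_000)                    # Opus 4.6 scenario
tokenizer_only = task_cost(int(10_000 * 1.35), 2_000)  # input inflated 1.35x
both = task_cost(int(10_000 * 1.35), 4_000)            # plus doubled output

for label, cost in [("baseline", baseline),
                    ("tokenizer only", tokenizer_only),
                    ("tokenizer + output", both)]:
    print(f"{label}: ${cost:.4f} ({cost / baseline - 1:+.1%})")
```

The point of writing it out is that the two multipliers apply to different rates, so the blended increase depends entirely on each workload’s input/output mix.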
That is why “up to 35% more input tokens” is not the same as “up to 35% higher bill.” The answer depends on the input/output mix. Coding agents and long-running tool loops tend to be especially exposed because they accumulate both larger prompts and more generated reasoning over time. If you have been tracking Claude Code regression reports, this is the kind of interaction developers actually notice: sessions filling faster, responses getting longer, quotas disappearing sooner.
Is This Tokenizer Inflation or Real Shrinkflation?
The strongest verified claim is modest: Anthropic changed tokenization, and Anthropic says that can increase token counts by up to 35% for the same input.
The broader public claim is that effective costs rose much more than that. Some secondary analyses push in that direction. CloudZero and TokenCost both frame the issue as unchanged list pricing but higher effective cost per task. A practitioner write-up from ClaudeCodeCamp reports measurements using Anthropic’s count_tokens endpoint to compare tokenization across versions.
That is useful, but it still leaves a gap.
What is verified:
– Anthropic documented the tokenizer change.
– Anthropic documented that Opus 4.7 may produce more output in higher-effort, later-turn agentic use.
– Anthropic kept the per-token price unchanged.
What is plausible but not fully proven by primary sources:
– That many real workloads see increases well above the tokenizer range once output growth is included.
– That improved task success offsets the extra token spend often enough to cancel the cost increase.
“Shrinkflation” is the accusation because developers experience fewer useful turns within the same budget or context window. That part is intuitive. If text tokenizes less efficiently, a fixed context window behaves like a smaller one in practice. But Anthropic also claims capability gains, so the clean version is: the billable unit changed, and the public docs do not quantify the net effect per workload.
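The “fixed window behaves smaller” point is easy to quantify under assumptions. This sketch takes the top of Anthropic’s stated 1.0-1.35× range and an illustrative 200k-token window; the window size is an assumption for the example, not a quoted spec:

```python
def effective_window(nominal_tokens: int, inflation: float) -> int:
    """How much old-tokenizer text fits once inputs inflate."""
    return int(nominal_tokens / inflation)

# At the top of the stated 1.0-1.35x range, an illustrative 200k window
# holds roughly as much text as ~148k tokens used to.
print(effective_window(200_000, 1.35))  # 148148
print(effective_window(200_000, 1.0))   # 200000: unaffected content types
```

That is the practical shape of the complaint: nothing on the pricing page changed, but the number of useful turns that fit in a session did, for some content types.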
For readers comparing model families, that matters as much as benchmarks. A model can look stronger than peers and still be materially less efficient on your prompts. That is part of what makes comparisons like GLM-5 vs Claude Opus harder than a benchmark chart suggests.
What Developers Should Watch in Agentic Workflows
The biggest jumps in Claude Opus 4.7 token usage should show up where both multipliers stack.
High-risk workloads:
– Multi-turn coding agents that keep replaying large transcripts
– Long document workflows where retrieval chunks are already expensive to stuff into context
– Higher-effort settings that encourage extra reasoning tokens
– Late-turn sessions where the transcript is large and the model is more likely to “think more”
Lower-risk workloads:
– Short one-shot prompts
– Tasks with tiny outputs
– Pipelines that aggressively summarize or trim conversation state
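The last bullet is the most actionable one. Here is a minimal sketch of transcript trimming, with a hypothetical message format; production agents usually summarize dropped turns rather than discard them outright:

```python
def trim_transcript(messages: list[dict], keep_last: int = 6) -> list[dict]:
    """Keep any system prompt plus only the most recent turns.

    messages: [{"role": "system" | "user" | "assistant", "content": str}, ...]
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

history = [{"role": "system", "content": "You are a coding agent."}]
history += [{"role": "user", "content": f"turn {i}"} for i in range(20)]

trimmed = trim_transcript(history, keep_last=4)
print(len(trimmed))  # 5: the system prompt plus the last 4 turns
```

Trimming attacks both multipliers at once: less replayed input per turn, and a shorter transcript for the model to reason over on the turns where it is most inclined to think more.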
Three metrics matter more than list price:
1. Tokens per successful task
2. Average output tokens by turn number
3. Context growth rate over long sessions
If those move against you, unchanged Claude Opus 4.7 pricing will not help much.
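None of these metrics requires vendor tooling. The class below is a hypothetical sketch of a per-turn usage log, not part of any Anthropic SDK:

```python
from collections import defaultdict

class UsageLog:
    """Tracks the three metrics that matter more than list price."""

    def __init__(self):
        self.turns = []  # (turn_number, input_tokens, output_tokens, success)

    def record(self, turn, input_tokens, output_tokens, success):
        self.turns.append((turn, input_tokens, output_tokens, success))

    def tokens_per_successful_task(self):
        total = sum(i + o for _, i, o, _ in self.turns)
        successes = sum(1 for *_, ok in self.turns if ok)
        return total / successes if successes else float("inf")

    def avg_output_by_turn(self):
        by_turn = defaultdict(list)
        for turn, _, out, _ in self.turns:
            by_turn[turn].append(out)
        return {t: sum(v) / len(v) for t, v in sorted(by_turn.items())}

    def context_growth(self):
        """Input-token deltas between consecutive turns of a session."""
        inputs = [i for _, i, _, _ in self.turns]
        return [b - a for a, b in zip(inputs, inputs[1:])]

log = UsageLog()
log.record(1, 10_000, 1_500, success=True)
log.record(2, 13_000, 2_200, success=True)
log.record(3, 17_500, 3_900, success=False)

print(log.context_growth())      # [3000, 4500]: replayed context keeps growing
print(log.avg_output_by_turn())  # output lengthening on later turns
```

Comparing these numbers before and after a model upgrade gives you the workload-level before/after table that the public docs do not.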
Anthropic’s docs are at least unusually explicit that something changed. What they do not provide is the number most teams actually need: before-and-after cost by workload class. That is the missing table.
For teams already dealing with version churn, that uncertainty is familiar; the Claude Opus 4.6 support changes showed how quickly the practical behavior of a model line can shift without looking like a formal pricing event.
Key Takeaways
- Anthropic says Claude Opus 4.7 token usage can rise for two separate reasons: a new tokenizer and more output from higher-effort reasoning.
- The new tokenizer can turn the same input into 1.0-1.35× as many tokens, according to Anthropic’s launch post.
- Per-token pricing stayed at $5/M input and $25/M output, so the issue is effective per-task cost, not headline list price.
- Agentic workflows are most exposed because they combine long transcripts, repeated context replay, and more generated output on later turns.
- Anthropic has not published a broad workload-level study showing whether capability gains offset the extra token spend in common production use.
Further Reading
- *Introducing Claude Opus 4.7*, Anthropic’s launch post detailing the tokenizer change and higher-effort behavior.
- *Claude Opus 4.7 product page*, Anthropic’s pricing and product page for Opus 4.7.
- *Claude Opus 4.7 Pricing In 2026: What It Actually Costs*, secondary analysis of unchanged list price versus higher effective spend.
- *Claude Opus 4.7 pricing: $5/1M, new tokenizer explained*, practical explanation of tokenizer-driven cost changes.
- *I Measured Claude 4.7’s New Tokenizer. Here’s What It Costs You.*, practitioner measurements using Anthropic’s token counting endpoint.
The open question is not whether usage changed; it is which workloads now cost enough more that a benchmark win stops being the relevant number.
