A weird thing about the claim that bots surpassed humans is that most people hear it as a story about fake people. As if half the internet is now bots arguing on Reddit, posting slop, and liking each other’s comments.
That is not what the 51% number means. Imperva/Thales’s 2025 Bad Bot Report said automated traffic made up 51% of all web traffic in 2024, versus 49% from humans, and 37% of total traffic was classified as bad bots, up from 32% in 2023. Search crawlers, scrapers, monitoring tools, API probes, price bots, and attack traffic all get counted in that same mix.
Here’s the update that makes the headline more interesting, not less: by 2025, Human Security was already breaking AI-driven traffic into training crawlers (67.5%), AI scrapers (31.9%), and agentic AI (1.7%). That is the market moving on from “did bots beat humans?” to the question that actually matters: which automated traffic gets allowed, charged, or blocked.
The core argument is simple: the 51% headline matters because it collapses useful automation, commercial extraction, and malicious abuse into one traffic metric, turning web traffic from an audience signal into a cost-allocation problem.
Did bots surpass humans in internet traffic in 2024?
Yes. By Imperva/Thales’s measurement, bots surpassed humans in internet traffic in 2024.
But wait, what exactly is being measured here? Not “how many fake users exist online.” Not “how many social accounts are bots.” The report is about web requests observed across Imperva’s network, which means it is counting machine hits to websites, apps, and APIs.
A single product page can now get requests from:
– Google indexing it
– an AI scraper copying it for training or search
– a competitor’s price bot checking it every minute
– a monitoring tool verifying uptime
– a credential-stuffing bot trying leaked passwords
– actual humans, maybe
Those all land in traffic totals. That is why "bots surpassed humans" is both true and easy to misunderstand.
A compact view helps:
| Traffic category (2024, Imperva/Thales) | Share of web traffic |
|---|---|
| Human traffic | 49% |
| Automated traffic | 51% |
| └─ Bad bots | 37% |
| └─ Other automation / good bots | 14% |
That last line is the whole fight. The 51% figure is not “the robots are here.” It is a single top-line bucket mixing together useful automation, extractive automation, and outright abuse.
Why the 51% figure matters more than the headline
The important part is not that machines “won.” It’s that traffic has stopped being a clean proxy for audience because one metric now hides radically different economic behaviors.
If your analytics spike, was that:
– more readers,
– more search crawler activity,
– more AI-driven traffic,
– more scraping,
– or more attack traffic?
Those are not different shades of the same thing. They are different cost structures pretending to be one number.
Here’s a simple hypothetical. Say a publisher had 10 million monthly page requests in January and 12 million by April, a tidy 20% traffic increase. Sounds healthy. But subscriptions are flat, ad revenue is flat, and hosting costs are up 28%. After segmenting logs, they find the extra 2 million requests came mostly from AI scrapers, aggressive monitoring, and bot probes hitting uncached pages.
That is not growth. It is somebody else’s automation using your infrastructure.
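What "segmenting logs" means in practice is mostly boring counting. Here is a minimal sketch in Python, assuming combined-format access logs and a few hand-rolled user-agent patterns; the class names are illustrative, and serious classification needs maintained bot lists and IP verification, not a handful of regexes.

```python
import re
from collections import Counter

# Rough user-agent buckets for illustration only; a real setup would use a
# maintained bot list plus IP/ASN verification rather than a few regexes.
UA_CLASSES = [
    ("ai_scraper", re.compile(r"GPTBot|ClaudeBot|CCBot|Bytespider", re.I)),
    ("search_crawler", re.compile(r"Googlebot|bingbot", re.I)),
    ("monitoring", re.compile(r"UptimeRobot|Pingdom|StatusCake", re.I)),
]

def classify(user_agent: str) -> str:
    for label, pattern in UA_CLASSES:
        if pattern.search(user_agent):
            return label
    return "human_or_unknown"

def segment(log_lines) -> Counter:
    """Count requests per traffic class for one batch of access-log lines."""
    counts = Counter()
    for line in log_lines:
        # Combined log format keeps the user agent in the last quoted field.
        parts = line.rsplit('"', 2)
        ua = parts[-2] if len(parts) == 3 else ""
        counts[classify(ua)] += 1
    return counts

# Comparing segment(january_lines) with segment(april_lines) shows whether
# the extra 2 million requests came from readers or from someone else's bots.
```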
This is why the 51% number matters. Once useful crawling, commercial extraction, and malicious abuse all hit the same servers, “traffic” becomes less of an audience metric and more of a billing dispute.
What the report counts as bots, and what it does not
A lot of people hear “bots” and think fake social-media personas. That category exists, but it is not what this report is mainly about. The report is about automated web traffic.
Imperva splits traffic into broad classes. “Good bots” include search engine crawlers and some useful automation. “Bad bots” include scraping, fraud, account takeover attempts, vulnerability probing, and other abusive behaviors. In Imperva’s count, bad bots alone reached a record 37% of total internet traffic in 2024.
Wait, if search crawlers and attack bots both count as bots, doesn’t that make the 51% number kind of mushy? Yes. Exactly.
The number is real, but it is not morally sorted. It bundles together:
– useful automation that helps the web function
– commercial extraction that copies or monitors without much reciprocity
– malicious abuse that drives everyone’s costs up
A small comparison table makes this clearer than another paragraph of abstraction:
| Traffic type | Typical purpose | Value created | Cost to site owner |
|---|---|---|---|
| Human sessions | Read, browse, buy, subscribe | Direct business value | Usually worth serving |
| Search crawlers | Index pages for discovery | Referral value | Moderate infrastructure cost |
| AI scrapers / training crawlers | Collect content or product data | Mostly captured by the crawler operator | Can create high uncached load with little return |
| Monitoring / API automation | Uptime checks, integrations | Operational value | Usually predictable and accepted |
| Bad bots | Fraud, scraping, credential stuffing, probing | Negative value | Security spend, origin load, incident risk |
Different vendors will produce different percentages because they see different traffic and define bot categories differently. That does not kill the pattern. It sharpens the real question: which machine load are you paying for, and what do you get back?
Why 2024 was the tipping point for bad bots
The cleanest progression in the data is this: bad bots rose from 32% of traffic in 2023 to 37% in 2024, AI made them cheaper to adapt, trusted crawler identities became camouflage, and operators started writing access policy by bot type instead of pretending all automation was one category.
First, the jump itself. Five points in one year is huge. At internet scale, that means a lot more scraping, probing, fraud, and attack traffic hitting applications that still often report “traffic” as if it implies demand.
Second, generative AI changed the workflow more than the existence of automation. Bots were already scraping pages, rotating IPs, testing leaked credentials, and hitting APIs long before ChatGPT. What AI changes is iteration cost. It helps operators rewrite scripts after blocks fail, generate more realistic request patterns, produce convincing spam or form fills, and keep adjusting faster than simple defenses can keep up.
According to coverage of the Imperva findings, the company observed roughly 2 million AI-enabled attacks per day in 2024. That does not mean every attack was some autonomous super-agent. It means the boring parts of bot operations got cheaper.
Then there is the stranger detail. Reporting around the data said ByteSpider, the crawler associated with ByteDance, accounted for 54% of AI-enabled attacks seen by Imperva. The revealing part is not a cartoon villain story. It is that known crawler identities can become useful cover. If a name is widely recognized, attackers spoof it. Reputation becomes attack surface.
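The practical counter to spoofed crawler names is to verify identity against the network rather than trust the User-Agent header. Below is a minimal sketch using the reverse-DNS-plus-forward-confirm method that Google and Bing document for their crawlers; the suffix map is an assumption to re-check against each operator's published guidance, and other operators may offer no verification path at all.

```python
import socket

# Hostname suffixes the major search operators publish for crawler verification.
# Treat this map as an assumption to re-check against current documentation.
VERIFIED_SUFFIXES = {
    "Googlebot": (".googlebot.com", ".google.com"),
    "bingbot": (".search.msn.com",),
}

def crawler_identity_matches(claimed_name: str, client_ip: str) -> bool:
    """Does the connecting IP actually belong to the crawler it claims to be?"""
    suffixes = VERIFIED_SUFFIXES.get(claimed_name)
    if not suffixes:
        return False  # no published verification path -> treat as unverified
    try:
        # Step 1: reverse-DNS the connecting IP and check the hostname suffix.
        hostname, _, _ = socket.gethostbyaddr(client_ip)
        if not hostname.endswith(suffixes):
            return False
        # Step 2: forward-confirm the hostname resolves back to the same IP,
        # so an attacker cannot win just by controlling their own reverse DNS.
        forward_ips = {info[4][0] for info in socket.getaddrinfo(hostname, None)}
        return client_ip in forward_ips
    except (socket.herror, socket.gaierror):
        return False

# A request whose User-Agent says "Googlebot" but fails this check is exactly
# the "reputation as camouflage" pattern described above.
```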
By 2025 and into 2026, the classification got more explicit: training crawlers (67.5%), AI scrapers (31.9%), agentic AI (1.7%). Those categories are not just taxonomy; they are the beginnings of a pricing and permission system for the web.
A training crawler raises one policy question: can it ingest your content at all? An AI scraper raises another: does it pay if it extracts value at scale? Agentic traffic raises a third: what actions can an automated system take on your site before it needs identity, rate limits, or a contract?
That is the payoff. The web is moving toward three tiers for automation, sketched in code just after this list:
– Allow: indexing, verified integrations, predictable monitoring
– Charge: high-volume extraction, premium API access, expensive dynamic endpoints
– Block: fraud, credential stuffing, abusive scraping, spoofed crawlers
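Here is a minimal sketch of how allow / charge / block becomes something a gateway can execute, assuming requests are already classified into categories like the ones above; the class names and the per-1,000-request prices are invented for illustration, not a real billing scheme.

```python
# Map traffic classes to an access decision and an optional (illustrative) price
# per 1,000 requests. Unknown classes fall through to a challenge, not a block.
POLICY = {
    "search_crawler": ("allow", None),
    "verified_integration": ("allow", None),
    "monitoring": ("allow", None),
    "training_crawler": ("charge", 0.50),
    "ai_scraper": ("charge", 2.00),
    "agentic_ai": ("charge", 5.00),
    "bad_bot": ("block", None),
    "spoofed_crawler": ("block", None),
}

def decide(traffic_class: str):
    """Return (action, price_per_1k_requests) for a classified request."""
    return POLICY.get(traffic_class, ("challenge", None))

print(decide("ai_scraper"))    # ('charge', 2.0)
print(decide("bad_bot"))       # ('block', None)
print(decide("weird_new_ua"))  # ('challenge', None)
```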
What readers should take from the bot traffic shift
If you run a site, app, store, or API, the move now is not “get serious about bots” in the abstract. It is to classify traffic by economic behavior and tie each class to an action.
Here’s a concrete operator checklist:
| Metric to watch | What it tells you | Trigger | Action |
|---|---|---|---|
| Verified-human sessions | Real audience | Requests rise while verified-human sessions stay flat | Treat the increase as suspect until segmented |
| Conversion rate | Business value from traffic | Traffic up, conversions flat or down | Check for scraper or low-intent bot inflation |
| Uncached origin requests | Expensive infrastructure load | Origin hits rise faster than pageviews | Rate-limit heavy paths, cache harder, gate expensive endpoints |
| WAF challenge rate | Active bot pressure | Sharp rise in challenges or failed challenges | Tighten bot rules, fingerprint repeat offenders, protect login/search/API routes |
| Top scraper user agents and ASN/IP clusters | Who is extracting data | New crawler identities spike or “known” crawlers behave oddly | Verify identity, throttle, require robots.txt compliance, or block |
| Cost per 1,000 requests by traffic class | Who is making you pay | One traffic class has high marginal cost and low value | Move it to paid API, metered access, or denial |
That last metric is the one more teams should compute. Not just traffic share. Cost per 1,000 requests by traffic class. Once you have that, the policy gets much easier.
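Here is the arithmetic for that metric, with request counts, attributed infrastructure spend, and attributed value all invented for the sake of the sketch; the point is the shape of the calculation, not the numbers.

```python
# Monthly requests, attributed cost (USD), and attributed value (USD) per class.
# All figures below are made up for illustration.
requests = {"human": 6_000_000, "search_crawler": 800_000,
            "ai_scraper": 1_500_000, "bad_bot": 1_200_000}
cost = {"human": 3_000.0, "search_crawler": 500.0,
        "ai_scraper": 2_400.0, "bad_bot": 2_900.0}
value = {"human": 25_000.0, "search_crawler": 6_000.0,
         "ai_scraper": 0.0, "bad_bot": 0.0}

for cls, reqs in requests.items():
    cost_per_1k = cost[cls] / (reqs / 1_000)
    net = value[cls] - cost[cls]
    print(f"{cls:15s} {cost_per_1k:6.2f} USD per 1k requests, net {net:+10.2f}")

# Classes with high cost per 1k and near-zero return are the candidates for
# metered access, a paid API, or outright denial.
```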
For publishers, the practical shift is from SEO policy to access policy. Decide which crawlers can index, which can train, which can scrape article archives, and which get cut off. The AI content feedback loop is one version of the same problem: machine systems consuming content, reproducing it elsewhere, and sending back less value than they take.
For platforms, the visible bot problem is only one layer. Fake accounts matter, and our piece on How Many Bots Are on X (Twitter)? gets into that. But underneath the feed is the bigger infrastructure shift: more of the web’s activity is systems talking to systems before any human shows up.
For ordinary users, the change is quieter. More of what looks like “internet activity” is now fetching, indexing, copying, ranking, probing, and acting by automation. The web is still made for people. But a growing share of its traffic economy is machine-to-machine negotiation over who gets access, who gets blocked, and who picks up the bill.
Key Takeaways
- Imperva/Thales said automated traffic reached 51% of all web traffic in 2024, so by that measure bots surpassed humans.
- That headline is easy to misread because it mixes useful automation, commercial extraction, and malicious abuse into one traffic bucket.
- The real shift is economic: web traffic is becoming less of an audience signal and more of a cost-allocation problem.
- Generative AI matters mostly because it lowers the cost of adapting, disguising, and scaling bot operations.
- By 2025/2026, firms were already segmenting AI-driven traffic into training crawlers, AI scrapers, and agentic AI, which is really a blueprint for allow / charge / block policy.
Further Reading
- Bots now make up more than half of global internet traffic (The Independent). News coverage of the Imperva/Thales finding that automated traffic reached 51% in 2024.
- Artificial Intelligence Fuels Rise of Hard-to-Detect Bots… Syndicated reporting with the key figures: 51% total bot traffic and 37% bad-bot traffic.
- 2025 Bad Bot Report (Imperva/Thales). The original report, including definitions, methodology, and year-over-year comparisons.
- 36% of global internet traffic originated from bots. A useful reminder that bot estimates vary depending on how they are measured.
- AI-driven traffic is the fastest-growing category of internet traffic. Follow-on reporting showing how AI crawlers, scrapers, and agents are becoming distinct traffic categories.
