Analysis of AI crawler traffic trends from April 2026, with a scorecard on the Q1 predictions made last month. Training crawlers reached 51.5%, past the 50% milestone. Applebot leapfrogged Bingbot to become the #5 AI crawler at 9.1%. Googlebot dropped below 30% for the first time. Bytespider almost doubled. Two Q1 predictions hit early, one missed, and the rest are still in play. Complete breakdown of crawler market share, industry targeting, robots.txt blocking patterns, and Workers AI model shifts based on Cloudflare Radar data.
This report is updated monthly with fresh Cloudflare Radar data. Bookmark this page to track how AI crawlers are reshaping web traffic each month.
Two of the six predictions I made in last month's Q1 review have already been confirmed by April data, a third is on track, and one was decisively wrong. Googlebot dropped below 30% for the first time (29.96%). Applebot kept surging, leapfrogged Bingbot, and is now the fifth-largest AI crawler at 9.1%. Training crawlers crossed the 50% threshold to reach 51.5% of all AI bot traffic, on pace for the predicted 55% by June. The miss: Meta-ExternalAgent didn't plateau at 18-20% as expected; it fell from 16.7% to 14.5%.
I analyzed 30 days of Cloudflare Radar AI Insights data covering April 4 through May 4, 2026, and the results confirm what the Q1 review predicted: the diversification of the AI crawler market is accelerating, not stabilizing. Two new crawlers entered the top tier (TikTokSpider and Anthropic's dedicated Claude-SearchBot), Bytespider nearly doubled, and the Workers AI model leaderboard saw three brand-new entrants. If you want the foundational primer on the pipelines behind this data, our explainer on how search engines really work walks through crawl budgets, inverted indexes, and learning-to-rank — all the systems these bots are feeding.
📊 Stats Alert: Dedicated AI training crawlers now generate 51.5% of all AI bot traffic — the first month past the 50% milestone. Applebot reached 9.1% to overtake Bingbot for #5. Googlebot fell below 30% for the first time. Bytespider grew +72% month-over-month.
Googlebot remains the largest AI-related crawler globally, but for the first time in Cloudflare Radar's tracking history, its share dipped below 30% — landing at 29.96%. Meta-ExternalAgent stays in second at 14.5%, but down significantly from March's 16.7%. ClaudeBot held essentially flat at 11.6%, narrowly overtaking GPTBot which slipped to 10.0%. The biggest move in April: Applebot leapfrogged Bingbot to become the #5 AI crawler at 9.1%. And Bytespider had a major resurgence, jumping from 3.6% to 6.2% to take the #7 spot.
Two new crawlers also broke into the top 15 this month: TikTokSpider (1.07%) — ByteDance's second crawler beyond Bytespider — and Claude-SearchBot (0.79%), Anthropic's newly observed dedicated search crawler distinct from ClaudeBot. Anthropic now operates two separate crawlers in the wild, one for training and one specifically for Claude's web search functionality.
| AI Bot | April 2026 Share (%) | Operator | Primary Purpose |
|---|---|---|---|
| Googlebot | 30.0% | Google | Search indexing + AI training (mixed) |
| Meta-ExternalAgent | 14.5% | Meta | AI training |
| ClaudeBot | 11.6% | Anthropic | Model training |
| GPTBot | 10.0% | OpenAI | Model training |
| Applebot | 9.1% | Apple | Search + AI features |
| Bingbot | 8.2% | Microsoft | Search indexing + AI (mixed) |
| Bytespider | 6.2% | ByteDance | AI training |
| Amazonbot | 4.7% | Amazon | AI training |
| OAI-SearchBot | 1.8% | OpenAI | ChatGPT Search |
| ChatGPT-User | 1.1% | OpenAI | Real-time user queries |
| TikTokSpider | 1.1% | ByteDance | AI training |
| Claude-SearchBot | 0.8% | Anthropic | Claude web search |
The concentration of crawling power among the traditional top five companies — Google, Meta, OpenAI, Anthropic, and Microsoft — stands at 74.3% in April (counting each company's primary crawler), down from 80.2% in March, 82.8% in February, and 84.5% in January. This is the fourth consecutive monthly decline, and the steepest single-month drop of the year. Apple is no longer the only diversification story: ByteDance now operates two crawlers (Bytespider + TikTokSpider) totaling 7.3%, putting it firmly in the top tier alongside Apple. Even adding Applebot (9.1%) and Bytespider (6.2%) to the traditional top five only reaches 89.6% for the top seven — meaningful "long tail" growth in just a quarter.
If you're managing AI bot access on your site, the list of operators you need to account for in your robots.txt got longer again this month — TikTokSpider, Claude-SearchBot, and the now-resurgent Bytespider all warrant explicit policies. For context on how crawling translates into actual referral traffic back to websites, check our companion Search Engine Referral Report on crawl-to-refer ratios.
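A per-crawler policy like the one described here can be checked before it is deployed. The following is a minimal sketch using Python's standard-library `urllib.robotparser`. The policy itself (block the training crawlers, allow the search crawlers), the `example.com` URL, and the `allowed` helper are illustrative assumptions, and the user-agent tokens are spelled as they appear in this report, so confirm them against each operator's documentation.

```python
from urllib.robotparser import RobotFileParser

# Illustrative policy only (an assumption, not a recommendation):
# disallow the training crawlers named in this report, allow the
# search-oriented crawlers from the same operators.
ROBOTS_TXT = """\
User-agent: Bytespider
User-agent: TikTokSpider
User-agent: ClaudeBot
User-agent: GPTBot
Disallow: /

User-agent: Claude-SearchBot
User-agent: OAI-SearchBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, url="https://example.com/pricing"):
    """Return True if `agent` may fetch `url` under ROBOTS_TXT."""
    return parser.can_fetch(agent, url)


for agent in ("Bytespider", "TikTokSpider", "ClaudeBot", "GPTBot",
              "Claude-SearchBot", "OAI-SearchBot"):
    print(f"{agent:18s} allowed={allowed(agent)}")
```

This only verifies how a standards-following parser reads the file. Whether a given crawler honors the rules is up to its operator.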
The headline shift in April is the simultaneous decline of all three "big AI labs" crawlers — Meta-ExternalAgent (-2.2 pp), GPTBot (-2.0 pp), and Googlebot (-1.6 pp) all lost share in the same month for the first time this year. The traffic that vacated didn't go to a new dominant crawler; it dispersed across Applebot (+3.3 pp), Bytespider (+2.6 pp), and the emerging long tail (TikTokSpider, Claude-SearchBot, ChatGPT-User).
| AI Bot | March 2026 Share | April 2026 Share | Change (pp) | Relative Change |
|---|---|---|---|---|
| Googlebot | 31.6% | 30.0% | -1.6 | -5.2% |
| Meta-ExternalAgent | 16.7% | 14.5% | -2.2 | -13.1% |
| GPTBot | 12.0% | 10.0% | -2.0 | -16.5% |
| ClaudeBot | 11.7% | 11.6% | -0.1 | -0.6% |
| Bingbot | 8.2% | 8.2% | 0.0 | 0.0% |
| Applebot | 5.8% | 9.1% | +3.3 | +56.3% |
| Bytespider | 3.6% | 6.2% | +2.6 | +72.6% |
| Amazonbot | 4.4% | 4.7% | +0.3 | +6.0% |
| OAI-SearchBot | 2.2% | 1.8% | -0.4 | -17.2% |
| ChatGPT-User | — | 1.1% | New | New entrant |
| TikTokSpider | — | 1.1% | New | New entrant |
| Claude-SearchBot | — | 0.8% | New | New entrant |
Six trends stand out from this comparison:
Applebot leapfrogged Bingbot. Applebot's surge from 5.8% to 9.1% (+3.3 pp, +56% relative) makes it the fifth-largest AI crawler globally — passing Microsoft's Bingbot for the first time. This was the central prediction of last month's quarterly review (we projected this could happen by May or June), and it happened in April. At the current absolute share of 9.1%, Applebot is now generating more crawl traffic than ClaudeBot and GPTBot combined a year ago.
Googlebot finally dropped below 30%. The Q1 review predicted Googlebot would fall below 30% during Q2, and it happened in the first month of the quarter. At 29.96%, Googlebot has now lost over 8.7 percentage points since January's 38.7%. The decline rate slowed (-1.6 pp in April vs -3.0 pp in March and -4.1 pp in February), suggesting Google may be approaching a floor as competitors saturate.
Meta-ExternalAgent reversed direction. After three consecutive monthly gains in Q1, Meta-ExternalAgent fell -2.2 pp in April to 14.5% — its first decline since tracking began. The Q1 review predicted a plateau at 18-20%; the actual outcome was a contraction. This is the single biggest miss in last month's predictions and warrants explanation: Meta may have completed a major training data refresh, or shifted crawling to a different unidentified user agent.
Bytespider had a major resurgence. ByteDance's crawler nearly doubled from 3.6% to 6.2% (+72% relative). Combined with the new TikTokSpider at 1.1%, ByteDance's crawler footprint reached 7.3% in April. That puts ByteDance within a point of Microsoft's Bingbot (8.2%) and well ahead of Amazon (4.7%), though still behind the combined footprints of OpenAI (12.9%), Anthropic (12.4%), and Apple (9.1%).
OpenAI's combined footprint declined. GPTBot (10.0%), OAI-SearchBot (1.8%), and ChatGPT-User (1.1%) sum to 12.9% — down from 14.2% in March (GPTBot 12.0% + OAI-SearchBot 2.2%). Even with the new ChatGPT-User entrant breaking out, OpenAI's overall share contracted. If you're building apps that rely on real-time retrieval, understanding what a web search API is and how these bots work under the hood matters more than ever.
Anthropic split into two crawlers. ClaudeBot held essentially flat at 11.6%, but the new Claude-SearchBot (0.8%) appeared as a distinct user agent for the first time. This mirrors OpenAI's split between GPTBot (training) and OAI-SearchBot (search) — Anthropic now operates a clear training/search separation, giving website owners the ability to allow one without the other.
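The trends above mix two measures: percentage-point (pp) change and relative change. As a rough sketch of the difference, the snippet below recomputes both from the rounded shares printed in the comparison table. The `share_change` helper is a hypothetical name. Because the table's relative column appears to be derived from unrounded shares, the results here can differ from it by up to about a point.

```python
def share_change(prev, curr):
    """Return (percentage-point change, relative change in %) between two shares."""
    pp = curr - prev
    relative = pp / prev * 100
    return round(pp, 1), round(relative, 1)


# March -> April 2026 shares as printed (rounded) in the comparison table.
shares = {
    "Googlebot": (31.6, 30.0),
    "Meta-ExternalAgent": (16.7, 14.5),
    "Applebot": (5.8, 9.1),
    "Bytespider": (3.6, 6.2),
}

changes = {bot: share_change(prev, curr) for bot, (prev, curr) in shares.items()}
for bot, (pp, rel) in changes.items():
    print(f"{bot:20s} {pp:+.1f} pp  {rel:+.1f}%")
```

A small pp move on a small base (Bytespider) produces a large relative change, while a similar pp move on a large base (Googlebot) barely registers in relative terms.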
The 50% threshold is in the rearview mirror. In April 2026, dedicated AI training crawlers reached 51.5% of all AI bot traffic — a clear majority for the first time, and a +1.6 pp gain over March. Mixed-purpose crawlers (the Googlebot/Bingbot bundle that does both search indexing and training) continued to slide, reaching 38.2%.
| Crawl Purpose | April 2026 Share | March 2026 Share | Change (pp) |
|---|---|---|---|
| Training | 51.5% | 49.9% | +1.6 |
| Mixed Purpose | 38.2% | 39.9% | -1.7 |
| Search | 7.6% | 7.7% | -0.1 |
| User Action | 2.2% | 2.1% | +0.1 |
| Undeclared | 0.4% | 0.4% | 0.0 |
Here's what the April data tells website owners:
Training crossed the majority threshold at 51.5%. The +1.6 pp gain is more modest than March's +4.5 pp surge (45.4% to 49.9%), but it's the seventh consecutive monthly increase. The deceleration suggests the Q1 acceleration phase has ended and Training is settling into a steadier growth pattern. For every 100 AI bot requests hitting your site, roughly 52 are now explicitly dedicated to training, meaning robots.txt rules targeting training crawlers (GPTBot, ClaudeBot, Meta-ExternalAgent, Applebot, Bytespider) cover the majority of AI traffic, not a fraction.
Mixed Purpose continued its slide to 38.2%. Crawlers that simultaneously index for search and collect training data have lost ground for four consecutive months — from 48.3% in January to 38.2% in April, a cumulative -10.1 pp drop. The decline mirrors Googlebot's -1.6 pp and matches the trajectory predicted in the Q1 review. You still can't separate the AI training from the search indexing with these crawlers — block Googlebot and you disappear from Google Search. For developers looking at how to ground AI responses with Google Search alternatives, this bundling problem keeps coming up.
Search crawling stabilized at 7.6%. After declining in March, AI search crawlers held essentially flat in April. OAI-SearchBot fell to 1.8% but was offset by the new Claude-SearchBot at 0.8%. The growing ecosystem of AI search API alternatives continues to drive demand for these retrieval systems.
User Action ticked up to 2.2%. ChatGPT-User now shows up explicitly in the top crawler list at 1.1%, accounting for the slight uptick in real-time user-query bots. This is the category most directly tied to ChatGPT's "browse" feature.
The 13.3-percentage-point gap between Training (51.5%) and Mixed Purpose (38.2%) is a decisive structural shift. Training crawlers aren't just the majority; they're pulling away. At April's pace of +1.6 pp per month, Training would approach 55% by the end of Q2 2026, while Mixed Purpose may fall below 35% by June.
AI crawlers showed remarkable consistency in which industries they targeted in April. Retail and e-commerce sites continue to absorb the largest share of AI crawling activity globally — but the standout April change is Internet and Telecom jumping from 16.9% to 18.3% (+1.4 pp), the largest single-month gain across any vertical this year.
| Industry Vertical | April 2026 Share | March 2026 Share | Change (pp) |
|---|---|---|---|
| Shopping & General Merchandise | 31.7% | 31.7% | 0.0 |
| Internet and Telecom | 18.3% | 16.9% | +1.4 |
| Computer and Electronics | 14.2% | 14.4% | -0.2 |
| News, Media, and Publications | 8.7% | 8.9% | -0.2 |
| Business and Industry | 4.9% | 5.0% | -0.1 |
| Travel and Tourism | 3.8% | 4.0% | -0.2 |
| Professional Services | 3.2% | 3.3% | -0.1 |
| Finance | 2.8% | 2.9% | -0.1 |
| Gambling | 2.4% | 2.9% | -0.5 |
| Career and Education | 2.1% | — | New top 10 |
Shopping held at exactly 31.7%. Retail has now stayed above 31% for four consecutive months, confirming that e-commerce content is the single most crawled category on the web. The notable Internet and Telecom rise (+1.4 pp) coincides with Applebot's surge, which suggests Apple may be heavily targeting telecom and SaaS content as it builds out Apple Intelligence. Career and Education entered the top 10 verticals at 2.1%, just behind Gambling, which slipped to 2.4%.
The industry-level breakdown gets more granular:
| Industry | April 2026 Share | March 2026 Share | Change (pp) |
|---|---|---|---|
| Retail | 28.7% | 28.6% | +0.1 |
| Computer Software | 13.0% | 13.2% | -0.2 |
| IT and Services | 5.9% | 5.8% | +0.1 |
| Internet | 5.2% | 5.1% | +0.1 |
| Telecommunications | 3.8% | 2.8% | +1.0 |
| Adult Entertainment | 2.6% | 2.7% | -0.1 |
| Online Media | 2.6% | 2.6% | 0.0 |
| Media | 2.4% | 2.5% | -0.1 |
| Marketing and Advertising | 2.3% | — | New top 10 |
| Gambling & Casinos | 2.3% | 2.7% | -0.4 |
The most notable industry-level change is Telecommunications growing from 2.8% to 3.8% (+1.0 pp). This is the largest single-month gain in any industry category in the past quarter, and it correlates almost exactly with Applebot's surge. Marketing and Advertising also entered the top 10 industries at 2.3%, replacing Hospitality. If you maintain developer documentation, API references, or technical tutorials, this is why choosing the right AI web search API for your applications matters — these crawlers are the infrastructure behind the search results your users see.
AI training crawlers aren't the only bots scanning the web at this scale, either. Technology detection platforms like Technologychecker.io crawl and fingerprint over 50 million domains using HTTP header analysis, JavaScript fingerprinting, DNS lookups, and headless browser rendering to identify 40,000+ technologies. Unlike AI training crawlers that take content for model weights, technology intelligence crawlers need to re-crawl frequently to track stack changes and new tech adoptions.
⚠️ Methodology note: Cloudflare Radar's robots.txt parsing corpus was refreshed in early Q2 2026, which materially changed the share denominator used for ranking AI crawler directives. The March/February/January percentages reported in last month's edition reflect the prior, larger sample and are preserved unchanged below for historical accuracy. April 2026 percentages use the new sample (4,102 parsed robots.txt files at the most recent snapshot). Because the samples differ, direct percentage-to-percentage comparisons between April and earlier months should be treated as directional, not arithmetic. Relative ranks remain comparable.
| AI Crawler | Domains Referencing | % of Parsed Files | Operator |
|---|---|---|---|
| GPTBot | 597 | 14.55% | OpenAI |
| ClaudeBot | 504 | 12.29% | Anthropic |
| CCBot | 493 | 12.02% | Common Crawl |
| Google-Extended | 476 | 11.60% | Google |
| Bytespider | 410 | 9.99% | ByteDance |
| Googlebot | 398 | 9.70% | Google |
| meta-externalagent | 368 | 8.97% | Meta |
| PerplexityBot | 358 | 8.73% | Perplexity |
| Amazonbot | 345 | 8.41% | Amazon |
| Applebot-Extended | 338 | 8.24% | Apple |
| ChatGPT-User | 322 | 7.85% | OpenAI |
| OAI-SearchBot | 270 | 6.58% | OpenAI |
| anthropic-ai | 245 | 5.97% | Anthropic |
The April rankings preserve March's top four order — GPTBot, ClaudeBot, CCBot, Google-Extended — but the gap between #1 and #5 narrowed. PerplexityBot climbed to #8 with 358 domains referencing it (8.73% of parsed files), up two positions from March. Applebot-Extended also jumped into the top 10 at 8.24% — a meaningful response to Applebot's traffic surge: as more website owners notice Applebot in their logs, they're adding the Applebot-Extended directive (which controls AI training opt-out independently of search visibility).
Notably, OpenAI now occupies three slots in the top 12: GPTBot (#1), ChatGPT-User (#11), and OAI-SearchBot (#12). No other operator has three crawlers explicitly referenced this often.
The meta-externalagent gap persists. Despite being the #2 AI crawler by traffic at 14.5%, Meta's bot only ranks #7 in robots.txt references — still one of the largest disparities between traffic share and blocking rate among major crawlers.
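The April percentages follow directly from the domain counts and the 4,102-file sample cited in the methodology note. As a small sketch (the `pct_of_parsed` helper is a hypothetical name, and only a subset of the table's rows is included), the shares and ranks can be reproduced like this:

```python
PARSED_FILES = 4102  # robots.txt files in the April 2026 sample, per the methodology note

# Domains referencing each crawler, copied from the April table (subset).
references = {
    "GPTBot": 597,
    "ClaudeBot": 504,
    "CCBot": 493,
    "Google-Extended": 476,
    "Bytespider": 410,
    "Googlebot": 398,
    "meta-externalagent": 368,
}


def pct_of_parsed(count, total=PARSED_FILES):
    """Share of parsed robots.txt files that reference a crawler, in percent."""
    return round(count / total * 100, 2)


# Rank crawlers by how many parsed files reference them (1 = most referenced).
ranked = sorted(references, key=references.get, reverse=True)
for rank, bot in enumerate(ranked, start=1):
    print(f"#{rank} {bot:20s} {pct_of_parsed(references[bot]):.2f}%")
```

Because every share uses the same denominator, ranks within April are comparable even though the percentages cannot be compared with the earlier, larger sample.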
For continuity, the previously published Q1 2026 robots.txt percentages remain as historical anchors:
| AI Crawler | January 2026 | February 2026 | March 2026 | April 2026 (new methodology) |
|---|---|---|---|---|
| GPTBot | 5.29% | 5.45% | 5.52% | 14.55%* |
| ClaudeBot | 4.33% | 4.62% | 4.72% | 12.29%* |
| CCBot | 4.40% | 4.63% | 4.53% | 12.02%* |
| Google-Extended | 4.04% | 4.36% | 4.37% | 11.60%* |
| Bytespider | 3.69% | 3.73% | 3.67% | 9.99%* |
| Amazonbot | 2.99% | 3.16% | 3.29% | 8.41%* |
*April 2026 percentages reflect a refreshed Cloudflare Radar corpus with a smaller sample of parsed robots.txt files. Use these to compare April crawlers to one another, not to the January–March series.
Despite the methodology shift, two qualitative observations hold across both samples:
Beyond crawlers, Cloudflare Radar tracks usage patterns on Cloudflare Workers AI — the platform that lets developers run AI models at the edge. The April 2026 data reveals the most dramatic single-month reshuffling of the model leaderboard since tracking began: three brand-new models entered the top 11, Llama 3 8B's share dropped 8.5 pp in a single month, and OpenAI's GPT-OSS 120B more than doubled.
| Model | April 2026 Share | March 2026 Share | Change (pp) | Developer |
|---|---|---|---|---|
| Llama 3 8B Instruct | 28.8% | 37.3% | -8.5 | Meta |
| Stable Diffusion XL Base 1.0 | 9.7% | 12.3% | -2.6 | Stability AI |
| Gemma 4 26B-A4B-IT (NEW) | 6.7% | — | New entrant | Google |
| Llama 4 Scout 17B | 6.1% | 7.0% | -0.9 | Meta |
| Whisper | 5.8% | 7.5% | -1.7 | OpenAI |
| GPT-OSS 120B | 5.5% | 2.1% | +3.4 | OpenAI |
| M2M-100 1.2B | 3.9% | 5.1% | -1.2 | Meta |
| Llama 3.2 1B Instruct (NEW) | 3.6% | — | New entrant | Meta |
| Llama 3 8B Instruct (AWQ) | 3.5% | 4.4% | -0.9 | Meta |
| FLUX.1 Schnell | 3.2% | 3.0% | +0.2 | Black Forest Labs |
| Kimi K2.6 (NEW) | 2.3% | — | New entrant | Moonshot AI |
The headline trend in April is the collapse of Llama 3 8B's dominance — its share fell from 37.3% to 28.8% in a single month, the largest single-month decline for any Workers AI model on record. The model still leads, but its lead has compressed dramatically: in February it commanded 40.1%; in April, 28.8%. The "other" category (models outside the top 11) reached 20.8%, continuing the long-tail expansion.
Three brand-new models entered the top 11 simultaneously: Google's Gemma 4 26B debuted at 6.7% (immediately #3, the highest debut of any model this year), Meta's lightweight Llama 3.2 1B Instruct at 3.6%, and Moonshot AI's Kimi K2.6 at 2.3%. Kimi K2.6 is the only model in the top 11 from a China-based developer; the rest come from US and European labs. These three new entrants together account for 12.6% of all Workers AI accounts.
Meta's overall provider share contracted again. Combining Llama 3 8B (28.8%), Llama 4 Scout (6.1%), M2M-100 (3.9%), Llama 3.2 1B (3.6%), and Llama 3 AWQ (3.5%), Meta models account for 45.9% of all Workers AI accounts, down from 53.8% in March. Meta has lost roughly 14 percentage points of provider share since January, when it stood near 60%. Google, by contrast, is suddenly meaningful at 6.7% (Gemma 4 26B alone), after barely registering last month.
GPT-OSS 120B's growth accelerated dramatically, climbing from 2.1% to 5.5% (+3.4 pp, +160% relative). OpenAI's open-source 120B model is the breakout adoption story of Q2 — it's now the #6 model on Workers AI, having ranked #8 just one month ago. Combined with Whisper (5.8%) and OpenAI's other models, OpenAI's Workers AI footprint reaches roughly 11-12%, holding its #2 provider position.
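The provider totals quoted above are simple sums over the model leaderboard. A minimal sketch of that aggregation follows, using the April shares from the table. The `provider_shares` helper is a hypothetical name, and only the top-11 models are included, so providers with models outside the top 11 will be understated.

```python
from collections import defaultdict

# (model, developer, April 2026 share in percent), from the leaderboard above.
APRIL_MODELS = [
    ("Llama 3 8B Instruct", "Meta", 28.8),
    ("Stable Diffusion XL Base 1.0", "Stability AI", 9.7),
    ("Gemma 4 26B-A4B-IT", "Google", 6.7),
    ("Llama 4 Scout 17B", "Meta", 6.1),
    ("Whisper", "OpenAI", 5.8),
    ("GPT-OSS 120B", "OpenAI", 5.5),
    ("M2M-100 1.2B", "Meta", 3.9),
    ("Llama 3.2 1B Instruct", "Meta", 3.6),
    ("Llama 3 8B Instruct (AWQ)", "Meta", 3.5),
    ("FLUX.1 Schnell", "Black Forest Labs", 3.2),
    ("Kimi K2.6", "Moonshot AI", 2.3),
]


def provider_shares(models):
    """Sum model shares per developer, rounded to one decimal place."""
    totals = defaultdict(float)
    for _name, developer, share in models:
        totals[developer] += share
    return {dev: round(total, 1) for dev, total in totals.items()}


april_by_provider = provider_shares(APRIL_MODELS)
for developer, share in sorted(april_by_provider.items(), key=lambda kv: -kv[1]):
    print(f"{developer:18s} {share:.1f}%")
```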
| Task Type | April 2026 Share | March 2026 Share | Change (pp) |
|---|---|---|---|
| Text Generation | 69.5% | 64.2% | +5.3 |
| Text-to-Image | 16.9% | 19.3% | -2.4 |
| Automatic Speech Recognition | 7.5% | 9.3% | -1.8 |
| Translation | 3.9% | 5.1% | -1.2 |
| Text-to-Speech | 0.8% | 0.5% | +0.3 |
| Text Classification | 0.7% | 0.8% | -0.1 |
| Image-to-Text | 0.5% | 0.7% | -0.2 |
Text Generation surged to 69.5%, up +5.3 pp from March. This is the largest single-month gain in any task category since tracking began, and it's directly attributable to GPT-OSS 120B's adoption surge plus Gemma 4 26B's debut — both are text-generation workloads. Text-to-Image declined to 16.9%, the lowest share of the year, as Stable Diffusion XL slid from 12.3% to 9.7%.
Text-to-Speech jumped to 0.8% (up from 0.5%), continuing its consistent growth and confirming sustained developer interest in voice synthesis at the edge. While still a small category, this is the largest relative increase (+60%) of any task category in April; Text Generation's +5.3 pp gain is larger in absolute terms but only about +8% relative.
Q1 2026 was the most transformative quarter in the history of AI web crawling. This section compiles data from three consecutive monthly analyses to document the structural shifts reshaping how AI companies interact with the open web.
| Metric | January 2026 | February 2026 | March 2026 | Q1 Change |
|---|---|---|---|---|
| Top crawler (Googlebot) | 38.7% | 34.6% | 31.6% | -7.1 pp |
| #2 crawler | GPTBot (12.8%) | Meta-ExternalAgent (15.6%) | Meta-ExternalAgent (16.7%) | Meta took #2 |
| Training crawl share | 42.0% | 45.4% | 49.9% | +7.9 pp |
| Mixed Purpose crawl share | 48.3% | 43.9% | 39.9% | -8.4 pp |
| Top 5 company concentration | 84.5% | 82.8% | 80.2% | -4.3 pp |
| Domains blocking GPTBot | 5.29% | 5.45% | 5.52% | +0.23 pp |
| Llama 3 8B (Workers AI) | 41.7% | 40.1% | 37.3% | -4.4 pp |
| AI Bot | Jan 2026 | Feb 2026 | Mar 2026 | Q1 Change (pp) | Q1 Direction |
|---|---|---|---|---|---|
| Googlebot | 38.7% | 34.6% | 31.6% | -7.1 | Declining |
| Meta-ExternalAgent | 11.6% | 15.6% | 16.7% | +5.1 | Rising |
| GPTBot | 12.8% | 12.1% | 12.0% | -0.8 | Stable/Slow decline |
| ClaudeBot | 11.4% | 11.1% | 11.7% | +0.3 | V-shaped recovery |
| Bingbot | 9.7% | 9.3% | 8.2% | -1.5 | Declining |
| Applebot | 2.5% | 3.1% | 5.8% | +3.3 | Surging |
| Amazonbot | 4.8% | 5.4% | 4.4% | -0.4 | Volatile |
| Bytespider | 3.5% | 3.3% | 3.6% | +0.1 | Flat |
| OAI-SearchBot | 2.0% | 2.6% | 2.2% | +0.2 | Volatile |
Googlebot's sustained decline is the defining trend of Q1. Losing 7.1 pp over three months represents the largest quarterly share loss for any single crawler in Cloudflare Radar's tracking history. This doesn't necessarily mean Google is crawling less — it means competitors are crawling more, faster. At 31.6%, Googlebot is no longer 2x the size of the next-largest crawler; the ratio to Meta-ExternalAgent is now 1.9x and shrinking.
Meta-ExternalAgent was the biggest winner of Q1. Gaining +5.1 pp over three months, Meta's crawler overtook GPTBot in February and never looked back. The growth rate decelerated sharply within the quarter (+4.0 pp in February, then +1.1 pp in March), suggesting Meta may be approaching a near-term equilibrium.
Applebot's March surge was the quarter's biggest surprise. After modest growth in January (+0.2 pp) and February (+0.6 pp), Applebot exploded in March (+2.7 pp, +87% relative). Over Q1, Applebot more than doubled from 2.5% to 5.8%. Apple is now a top-six AI crawler operator, a status no one predicted at the start of the quarter.
The most consequential structural shift of Q1 2026 happened in the crawl purpose data:
| Crawl Purpose | Jan 2026 | Feb 2026 | Mar 2026 | Q1 Change (pp) |
|---|---|---|---|---|
| Training | 42.0% | 45.4% | 49.9% | +7.9 |
| Mixed Purpose | 48.3% | 43.9% | 39.9% | -8.4 |
| Search | 6.9% | 8.2% | 7.7% | +0.8 |
| User Action | 2.2% | 2.0% | 2.1% | -0.1 |
| Undeclared | 0.5% | 0.4% | 0.4% | -0.1 |
In January, Mixed Purpose led Training by 6.3 pp (48.3% vs 42.0%). By February, Training had overtaken Mixed Purpose for the first time (45.4% vs 43.9%). By March, the gap had widened to 10 pp (49.9% vs 39.9%).
This crossover represents a fundamental change in how AI companies interact with the web:
The shift gives website owners more control. You can now block the majority of AI training crawling without sacrificing search visibility — something that wasn't possible when Mixed Purpose crawlers dominated.
| Metric | January | February | March |
|---|---|---|---|
| Top 5 companies (Google, Meta, OpenAI, Anthropic, Microsoft) | 84.5% | 82.8% | 80.2% |
| Top 6 companies (+ Apple) | 87.0% | 85.9% | 86.0% |
| Top 4 crawlers share | 74.4% | 73.5% | 72.0% |
The AI crawler market is diversifying. The traditional top five lost 4.3 pp of concentration over Q1, but when you include Apple as a sixth major player, the top-six concentration remained remarkably stable at ~86%. The diversification is happening within the top tier, not from outside it.
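The "stable at ~86%" observation is just the reported top-five concentration plus Applebot's share for each month. A short sketch of that arithmetic, using the figures from the Q1 tables (the variable names are illustrative):

```python
# Reported top-five company concentration and Applebot share (percent), Q1 2026.
top_five = {"Jan": 84.5, "Feb": 82.8, "Mar": 80.2}
applebot = {"Jan": 2.5, "Feb": 3.1, "Mar": 5.8}

# Top-six concentration = top five + Apple.
top_six = {month: round(top_five[month] + applebot[month], 1) for month in top_five}

# How much each measure moved across the quarter.
top_five_drop = round(top_five["Jan"] - top_five["Mar"], 1)
top_six_range = round(max(top_six.values()) - min(top_six.values()), 1)

print(top_six)
print(f"top-five drop: {top_five_drop} pp, top-six range: {top_six_range} pp")
```

The top-five figure falls by several points while the top-six figure stays within about a point, which is the basis for saying the share moved to Apple, not to outsiders.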
| AI Crawler | Jan 2026 | Feb 2026 | Mar 2026 | Q1 Change |
|---|---|---|---|---|
| GPTBot | 5.29% | 5.45% | 5.52% | +0.23 pp |
| CCBot | 4.40% | 4.63% | 4.53% | +0.13 pp |
| ClaudeBot | 4.33% | 4.62% | 4.72% | +0.39 pp |
| Google-Extended | 4.04% | 4.36% | 4.37% | +0.33 pp |
| Googlebot | 4.13% | 3.77% | 3.80% | -0.33 pp |
| Bytespider | 3.69% | 3.73% | 3.67% | -0.02 pp |
| meta-externalagent | 3.24% | 3.26% | 3.34% | +0.10 pp |
| Amazonbot | 2.99% | 3.16% | 3.29% | +0.30 pp |
ClaudeBot saw the largest blocking increase across Q1 at +0.39 pp, overtaking CCBot to become the second-most referenced AI crawler in robots.txt files behind GPTBot. The blocking wave is real but slow: over the entire quarter, GPTBot grew from 5.29% to just 5.52%. More than 94% of domains still have no robots.txt rule for even the most-referenced AI crawler. At the current rate, it would take years for blocking rates to reach even 10%.
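To put a rough number on "years", here is a back-of-the-envelope sketch. It assumes GPTBot's January-to-March change (two monthly intervals) continues linearly, which is a simplifying assumption and not a forecast. The `months_to_threshold` helper is a hypothetical name.

```python
def months_to_threshold(start, end, intervals, threshold):
    """Months until `threshold` is reached if the average monthly change persists.

    `start` and `end` are shares in percent, observed `intervals` months apart.
    Returns None if the share is flat or falling.
    """
    monthly_change = (end - start) / intervals
    if monthly_change <= 0:
        return None
    return (threshold - end) / monthly_change


# GPTBot robots.txt reference rate: 5.29% (January) -> 5.52% (March).
months = months_to_threshold(5.29, 5.52, intervals=2, threshold=10.0)
years = months / 12
print(f"about {months:.0f} months, or {years:.1f} years")
```

Under that linear assumption the 10% mark is a little over three years away. Any acceleration or plateau in adoption would change the answer substantially.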
| AI Crawler | March Traffic Share | March Blocking Rate | Gap (pp) |
|---|---|---|---|
| Googlebot | 31.6% | 3.80% | 27.8 |
| Meta-ExternalAgent | 16.7% | 3.34% | 13.4 |
| GPTBot | 12.0% | 5.52% | 6.5 |
| ClaudeBot | 11.7% | 4.72% | 7.0 |
| Amazonbot | 4.4% | 3.29% | 1.1 |
Meta-ExternalAgent's 13.4 pp gap is the most actionable — it's a dedicated training crawler with no search indexing benefit, yet it's blocked by fewer domains than GPTBot despite generating more traffic.
| Model | Jan 2026 | Feb 2026 | Mar 2026 | Q1 Change |
|---|---|---|---|---|
| Llama 3 8B Instruct | 41.7% | 40.1% | 37.3% | -4.4 pp |
| Stable Diffusion XL Base 1.0 | 13.4% | 13.4% | 12.3% | -1.1 pp |
| Whisper | 8.5% | 8.3% | 7.5% | -1.0 pp |
| Llama 4 Scout 17B | 7.7% | 6.7% | 7.0% | -0.7 pp |
| M2M-100 1.2B | 5.6% | 5.4% | 5.1% | -0.5 pp |
| Llama 3 8B Instruct (AWQ) | 4.7% | 4.9% | 4.4% | -0.3 pp |
| FLUX.1 Schnell | 2.4% | 2.5% | 3.0% | +0.6 pp |
| GPT-OSS 120B | -- | 1.6% | 2.1% | New (+2.1 pp) |
| Whisper Large V3 Turbo | 1.4% | 1.6% | 1.7% | +0.3 pp |
Every top model except FLUX.1 Schnell, GPT-OSS 120B, and Whisper Large V3 Turbo lost share across Q1. Meta's dominance is slowly eroding — from ~60% in January to 53.8% in March. GPT-OSS 120B is the breakout model of Q1, debuting in February and climbing to 2.1% in March. FLUX.1 Schnell is gaining on Stable Diffusion XL, emerging as a credible challenger in the text-to-image space.
The Training Takeover. Training crawlers went from minority (42.0%) to near-majority (49.9%) in a single quarter. The web's content is now primarily being consumed for AI model weights rather than search indexing. Website owners who want to opt out of AI training have clearer tools to do so.
The Google Erosion. Googlebot lost 7.1 pp across Q1. This is almost certainly driven by competitors growing faster rather than Google crawling less, but the proportional shift matters for every website owner's traffic analysis.
The Meta Ascendancy. Meta-ExternalAgent gained +5.1 pp, overtook GPTBot for #2 in February, and finished the quarter at 16.7%. Meta's aggressive Llama training pipeline is driving unprecedented data collection.
Apple's Arrival. Applebot went from 2.5% to 5.8% across Q1, with most growth concentrated in March. Apple is now a top-six AI crawler operator and should be included in any website owner's AI bot management strategy.
The Diversification of Workers AI. The model ecosystem became meaningfully more diverse. Llama 3 8B's share fell from 41.7% to 37.3% as developers adopted newer models and the long tail grew from ~15% to ~20%.
Based on Q1 trends, here's what I expect to see in Q2 2026:
Training crawlers will exceed 55%. The +7.9 pp Q1 trajectory suggests Training could reach 55-57% by June 2026, with Mixed Purpose falling below 35%.
Googlebot will drop below 30%. The current quarterly decline rate would put Googlebot in the 28-30% range by end of Q2.
Applebot will enter the top five. If Apple sustains even half of March's growth rate, Applebot could pass Bingbot (currently 8.2%) by May or June.
Meta-ExternalAgent will plateau around 18-20%. The deceleration trend suggests Meta's growth rate is stabilizing.
GPT-OSS 120B will enter the Workers AI top five. At its current growth rate, it could reach 3-4% by June, approaching FLUX.1 Schnell territory.
Robots.txt blocking will remain below 6% for all crawlers. The slow growth rate means meaningful blocking thresholds remain years away.
Now that April data is in, here's how each Q1 prediction held up. Two of six predictions were confirmed in the very first month of Q2, one is on track, one is a near hit, one missed in the opposite direction, and one can't be evaluated because of the methodology change.
| # | Q1 Prediction | Predicted Window | April 2026 Actual | Verdict |
|---|---|---|---|---|
| 1 | Training crawlers will exceed 55% | By June 2026 | 51.5% (passed 50% threshold) | 🟡 On track |
| 2 | Googlebot will drop below 30% | End of Q2 | 29.96% | ✅ Hit (early) |
| 3 | Applebot will enter the top five | May or June | 9.1%, passed Bingbot | ✅ Hit (early) |
| 4 | Meta-ExternalAgent will plateau at 18-20% | Q2 | Fell to 14.5% | ❌ Miss |
| 5 | GPT-OSS 120B will enter Workers AI top five | By June | 5.5% (#6, near top 5) | 🟡 Near hit |
| 6 | Robots.txt blocking will remain below 6% | Q2 | Methodology shift (see note) | ⚠️ Inconclusive |
The predictions that hit early. Applebot's leap into the top five and Googlebot's sub-30% drop both happened in April rather than May/June. The trajectory of both crawlers in March was steep enough that the Q2 timeline was conservative — Apple's continued buildout of Apple Intelligence appears to be a sustained, not transient, ramp.
The miss that matters. The Meta-ExternalAgent prediction was the most confidently stated and the most decisively wrong. Q1 data showed three months of consecutive gains decelerating — a textbook plateau pattern — but April produced a -2.2 pp contraction, the first decline since tracking began. The simplest hypothesis: Meta completed a major Llama training data refresh and dialed back crawl intensity. The alternative: Meta is shifting traffic to a not-yet-identified user agent. Either way, the lesson is that "decelerating gains" don't always precede a plateau — they can also precede a contraction.
The training prediction is on track. Training reached 51.5% in April, +1.6 pp month-over-month. To hit 55% by June would require a ~+1.7 pp/month pace, which is roughly the current trajectory. This prediction looks likely to land within its window.
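The required pace can be worked out explicitly. The sketch below assumes two monthly readings remain in the prediction window (May and June) and that growth is linear. Both are simplifying assumptions, and the function names are illustrative.

```python
def required_pace(current, target, months_left):
    """Average monthly gain (pp) needed to move from `current` to `target`."""
    return (target - current) / months_left


def projected(current, monthly_gain, months_left):
    """Share after `months_left` months at a constant `monthly_gain` (pp)."""
    return current + monthly_gain * months_left


APRIL_TRAINING = 51.5   # percent, April 2026
APRIL_GAIN = 1.6        # pp gained over March
MONTHS_LEFT = 2         # May and June (assumption)

pace = required_pace(APRIL_TRAINING, 55.0, MONTHS_LEFT)
june_estimate = projected(APRIL_TRAINING, APRIL_GAIN, MONTHS_LEFT)
print(f"needed: {pace:.2f} pp/month, June at April's pace: {june_estimate:.1f}%")
```

At exactly April's pace the June figure lands a few tenths of a point short of 55%, so the prediction depends on growth holding steady or ticking up slightly.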
The robots.txt prediction is inconclusive. Cloudflare Radar's parsed-corpus refresh changed the share denominator (see methodology note above), so the "below 6%" threshold can't be cleanly evaluated against the new sample. The underlying behavior — slow, selective adoption — does appear to continue.
The takeaway: when a prediction is built on sustained directional momentum (Googlebot decline, Applebot growth), it tends to hit. When it's built on extrapolating a slowdown into a plateau (Meta-ExternalAgent), it's risky — the same deceleration that looks like a soft landing can also look like the leading edge of a reversal.
Based on what I've found in the April 2026 Cloudflare Radar data, here are the actions I'd prioritize:
Add Bytespider and TikTokSpider to your robots.txt review. ByteDance now operates two AI crawlers that together generate 7.3% of all AI bot traffic, putting its footprint close to Microsoft's Bingbot (8.2%) and ahead of Amazon (4.7%). Bytespider almost doubled in April (3.6% → 6.2%), and TikTokSpider made its top-15 debut at 1.1%. If you maintain a list of crawlers covered by your robots.txt, both should now be on it.
Update your Applebot policy — Applebot is now the #5 crawler, not the #6. Apple's crawler reached 9.1% of AI bot traffic in April and passed Bingbot to take the fifth spot. Applebot-Extended also entered the top 10 in robots.txt references this month, suggesting more website operators are adding it. Apple offers Applebot-Extended as a separate user agent for opting out of AI training while maintaining Apple Search visibility.
Distinguish ClaudeBot from Claude-SearchBot. Anthropic now operates two separate crawlers: ClaudeBot (training, 11.6%) and Claude-SearchBot (search, 0.8% — first observed in April). This mirrors OpenAI's GPTBot/OAI-SearchBot split. If you want to allow Claude's search functionality to surface your content while opting out of training, you can now do that with separate directives.
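One way to sanity-check a split policy like this before deploying it is to run the rules through Python's standard-library robots.txt parser. This is a sketch under assumptions: the user-agent tokens are the names used in this report (confirm the exact tokens against Anthropic's own documentation), the URL is a placeholder, and `urllib.robotparser` reflects Python's reading of the rules, not necessarily how each crawler interprets them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: opt out of training, stay visible to Claude's search.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: Claude-SearchBot
Allow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/article"  # placeholder URL
for agent in ("ClaudeBot", "Claude-SearchBot"):
    print(agent, "allowed" if allowed(ROBOTS_TXT, agent, url) else "blocked")
```

The same check works for the GPTBot/OAI-SearchBot pair by swapping the tokens.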
Recognize that training crawlers are now the clear majority. At 51.5%, dedicated training crawlers crossed the 50% threshold for the first time in April. A robots.txt policy that only addresses search crawlers now covers less than half of AI bot traffic on your site.
Don't assume Meta-ExternalAgent will keep growing. Meta's bot fell from 16.7% to 14.5% in April — the first contraction on record. If you've been deferring a meta-externalagent rule because Meta's traffic seemed to keep growing, the trajectory has changed. Meta is still the #2 AI crawler, but the trend isn't a one-way street.
Audit your Googlebot assumptions. Googlebot dropped below 30% for the first time in April. For every 10 AI bot requests hitting your site in January, Google accounted for about 4. In April, it's closer to 3. The other 7 increasingly come from dedicated training crawlers, not search indexing bots.
Every AI crawler in this report exists because AI companies need fresh, structured web data to power their models and search products. At WebSearchAPI.ai, we sit on the other side of this equation — providing developers and AI agents with a clean, fast, and affordable way to access real-time web data without running their own crawlers.
Here's why this matters in the context of April's trends:
If the data in this report tells you anything, it's that the volume and complexity of AI web crawling is only accelerating — and the balance has now decisively shifted toward dedicated training. WebSearchAPI.ai is purpose-built for developers who want to harness that web intelligence without becoming a crawling operation themselves. Learn more about what a web search API can do for your stack.
An AI crawler (also called an AI bot or AI spider) is an automated program that visits websites to collect content for training artificial intelligence models or for powering AI search features. Unlike traditional search engine crawlers that index pages for search results, AI crawlers like GPTBot, ClaudeBot, and Meta-ExternalAgent specifically collect data to train large language models (LLMs). Some crawlers, like Googlebot, serve both purposes: indexing for search and collecting training data at the same time. You can identify AI crawlers by their user-agent strings in your server logs or through tools like Cloudflare Radar.
This report is updated monthly with fresh data from Cloudflare Radar AI Insights. Each edition covers a rolling 30-day window and includes month-over-month comparisons so you can track trends over time. Quarterly editions (such as the Q1 2026 review) also include full-quarter trajectory analysis. Bookmark this page or check back at the beginning of each month for the latest analysis of AI crawler traffic patterns, market share shifts, and robots.txt blocking trends.
Yes. The primary method is adding disallow rules to your robots.txt file for specific AI crawler user agents. For example, adding User-agent: GPTBot followed by Disallow: / will request that OpenAI's crawler stop visiting your site. However, robots.txt is a voluntary protocol — crawlers are not technically required to obey it. As of April 2026, GPTBot and ClaudeBot remain the two most-referenced AI crawlers in robots.txt files, but absolute adoption is still a small fraction of the web. Some CDN providers like Cloudflare also offer dashboard-level controls to block or rate-limit AI bots.
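To make the directive format concrete, here is a minimal sketch that generates the disallow rules for a list of crawlers. The helper name and the particular list are illustrative choices of mine; the tokens are the ones named in this report, so check each operator's documentation for the exact spelling before relying on them.

```python
# Illustrative list of training-related user agents named in this report.
TRAINING_CRAWLERS = [
    "GPTBot",
    "ClaudeBot",
    "meta-externalagent",
    "Bytespider",
    "Applebot-Extended",
    "Google-Extended",
]


def build_disallow_rules(user_agents, path="/"):
    """Return robots.txt text with one User-agent/Disallow group per crawler."""
    groups = [f"User-agent: {agent}\nDisallow: {path}" for agent in user_agents]
    # Blank lines separate groups; end with a newline so the text can be appended to a file.
    return "\n\n".join(groups) + "\n"


rules = build_disallow_rules(TRAINING_CRAWLERS)
print(rules)
```

Because Googlebot, Bingbot, and Applebot are deliberately absent from the list, the generated rules leave traditional search indexing untouched.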
AI training crawlers (like GPTBot, ClaudeBot, and Meta-ExternalAgent) collect web content to build and improve AI models. They typically scrape large volumes of content from many sites. AI search crawlers (like OAI-SearchBot and the newly-observed Claude-SearchBot) fetch specific pages in real time when a user performs a search query through an AI tool like ChatGPT or Claude. The key difference: training crawlers take your content to make the model smarter, while search crawlers fetch your content to answer a specific user question — and may drive traffic back to your site. As of April 2026, training crawling commands a clear majority at 51.5% of all AI bot traffic, while search crawling holds at 7.6%.
Blocking dedicated AI training crawlers like GPTBot, ClaudeBot, Meta-ExternalAgent, Bytespider, or Applebot-Extended will not affect your rankings in Google, Bing, or other traditional search engines. These crawlers are separate from the search indexing bots. However, blocking Googlebot will remove your site from Google Search entirely since Google uses the same crawler for both search indexing and AI training. Google offers a middle ground with the Google-Extended user agent — blocking it opts you out of AI training while keeping your search presence intact. Apple offers the same kind of separation with Applebot-Extended, which entered the top 10 most-referenced robots.txt user agents for the first time in April 2026.
Applebot grew from 5.8% in March to 9.1% in April, taking the #5 spot from Bingbot. The surge is consistent with Apple's continued expansion of Apple Intelligence across its ecosystem. As Apple integrates more AI features into Siri, Safari, and iOS/macOS, the company needs fresh web content to power these capabilities. The April industry data also shows Telecommunications and Internet & Telecom verticals receiving disproportionate Applebot traffic, suggesting Apple is targeting SaaS, telecom, and connectivity content as part of its build-out. Website owners should now treat Applebot as a top-tier AI crawler and add Applebot-Extended directives if they want to opt out of AI training while maintaining Apple Search visibility.
In April 2026, Cloudflare Radar identified a new Anthropic user agent — Claude-SearchBot — operating distinctly from the existing ClaudeBot crawler. ClaudeBot continues to handle training and held essentially flat at 11.6%, while Claude-SearchBot debuted at 0.8% as a search-focused crawler powering Claude's web search functionality. Anthropic now mirrors OpenAI's separation of GPTBot (training) and OAI-SearchBot (search), giving website owners the ability to allow Claude's search to surface their content while opting out of training, or vice versa.
Cloudflare's global network spans 330+ cities in 125+ countries and processes over 81 million HTTP requests per second. Through its Radar platform, Cloudflare identifies and classifies AI bot traffic by analyzing user-agent strings, request patterns, and behavioral signatures across all sites on its network. The data in this report comes from Cloudflare Radar's AI Insights endpoints, which aggregate these signals into share-of-traffic percentages by bot, crawl purpose, industry, and region.
In April 2026, Bytespider had the largest relative growth at +72.6% (3.6% → 6.2%), and Applebot had the largest absolute gain at +3.3 pp (5.8% → 9.1%). Over the rolling four-month window from January to April 2026, Applebot has been the standout grower overall — climbing from 2.5% to 9.1%, a +6.6 pp gain that is the largest absolute increase of any AI crawler this year. Meta-ExternalAgent, which led the Q1 growth ranking, reversed direction in April with a -2.2 pp contraction.
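Relative growth and absolute gain answer different questions, which is why two different bots top the two rankings. The sketch below recomputes both from the rounded shares quoted in this report; the function name is my own. Note that rounded inputs give roughly +72.2% for Bytespider, so I assume the +72.6% figure was derived from unrounded Radar values.

```python
def growth(previous: float, current: float) -> tuple[float, float]:
    """Return (absolute change in percentage points, relative change in percent)."""
    absolute_pp = current - previous
    relative_pct = absolute_pp / previous * 100
    return absolute_pp, relative_pct


# (March share, April share) as rounded in this report.
shares = {
    "Bytespider": (3.6, 6.2),
    "Applebot": (5.8, 9.1),
    "Meta-ExternalAgent": (16.7, 14.5),
}

for bot, (march, april) in shares.items():
    pp, pct = growth(march, april)
    print(f"{bot}: {pp:+.1f} pp, {pct:+.1f}% relative")
```

A small crawler can post a large relative jump on a modest absolute gain, so it is worth reading both columns before deciding which bot deserves attention first.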
The percentages in this report represent share of identified AI bot requests, not share of total web traffic. Cloudflare Radar tracks the proportion of AI-related crawler activity relative to other AI bots, providing a competitive landscape view. The actual percentage of total web traffic from AI bots varies by website, but industry estimates suggest AI crawlers now account for a meaningful and growing share of overall internet traffic, particularly for content-heavy sites in retail, technology, and media.
Check your server access logs for known AI bot user-agent strings (GPTBot, ClaudeBot, meta-externalagent, Applebot, Bytespider, Amazonbot, etc.). Most web analytics platforms filter out bot traffic by default, so log-level analysis gives the most accurate picture. Cloudflare users can view AI bot activity directly in their dashboard. For a structured approach, consider using a web search API to understand how your content appears in AI-powered search results and ensure your most important pages are properly accessible.
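A log-level check can be as simple as counting lines whose user-agent field contains a known bot token. The sketch below assumes plain-text access logs where the user agent appears somewhere in each line; the sample lines and their user-agent strings are made up for illustration and are not the bots' real strings.

```python
import re
from collections import Counter

# Tokens taken from the list above; extend as new crawlers appear.
AI_BOT_TOKENS = ["GPTBot", "ClaudeBot", "meta-externalagent", "Applebot", "Bytespider", "Amazonbot"]

_PATTERN = re.compile("|".join(re.escape(t) for t in AI_BOT_TOKENS), re.IGNORECASE)
_CANONICAL = {t.lower(): t for t in AI_BOT_TOKENS}


def count_ai_bot_hits(log_lines):
    """Count log lines per AI bot token, matching case-insensitively."""
    counts = Counter()
    for line in log_lines:
        match = _PATTERN.search(line)
        if match:
            counts[_CANONICAL[match.group(0).lower()]] += 1
    return counts


# Hypothetical sample lines in a common access-log layout.
sample = [
    '203.0.113.5 - - [04/Apr/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleUA GPTBot/1.0"',
    '203.0.113.6 - - [04/Apr/2026:10:00:01 +0000] "GET /a HTTP/1.1" 200 128 "-" "ExampleUA bytespider"',
    '203.0.113.7 - - [04/Apr/2026:10:00:02 +0000] "GET /b HTTP/1.1" 200 256 "-" "Mozilla/5.0 (ordinary browser)"',
]

hits = count_ai_bot_hits(sample)
for bot, n in hits.most_common():
    print(bot, n)
```

Keep in mind that user-agent strings can be spoofed, so counts like these are a starting point rather than proof of which operator sent the request.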
Understanding where this data comes from — and what it can and cannot tell you — is critical for interpreting the trends above. Here's a full breakdown of how Cloudflare Radar collects, classifies, and aggregates the AI crawler data used in this report.
| Metric | Value |
|---|---|
| Global presence | 330+ cities in 125+ countries |
| HTTP requests | 81 million/second average, peaks >129 million/second |
| DNS queries | 67 million/second (authoritative + resolver) |
This scale is what makes Cloudflare Radar one of the most comprehensive sources of internet traffic data available. The data in this report comes from two primary sources:
For routing data, Cloudflare also uses RIPE RIS data from RIPE NCC (BGP route collectors).
Cloudflare uses a layered detection system to identify and classify AI crawlers:
💡 Expert Insight: The layered approach matters because not all AI crawlers identify themselves honestly. User-agent matching catches transparent bots like GPTBot and ClaudeBot. Behavioral analysis and honeypots catch crawlers that disguise themselves as regular browsers.
Bots are categorized into these purpose buckets:
⚠️ Warning: Keep these limitations in mind when interpreting the data in this report:
This edition uses data from Cloudflare Radar's AI Insights endpoint (/radar/ai/bots/summary/*), Workers AI inference endpoint (/radar/ai/inference/summary/*), and robots.txt analysis endpoint (/radar/robots_txt/top/user_agents/directive). The April 2026 monthly data covers April 4 through May 4, 2026, with month-over-month comparisons against the prior month-long window (March 4 to April 4, 2026). The Q1 2026 Quarterly Review compiles data from three consecutive monthly analyses covering January through March 2026.
I queried bot traffic breakdowns by user agent, crawl purpose, industry, and vertical. Workers AI data covers model and task distribution by account share. Robots.txt analysis covers domain-level crawler policies for AI user agents. All percentages represent share of identified AI bot requests (for crawling data) or share of accounts (for Workers AI data), not share of total web traffic.
⚠️ Methodology change for April 2026: Cloudflare Radar's robots.txt parsed-corpus was refreshed in early Q2 2026, changing the share denominator. April robots.txt percentages are computed against 4,102 parsed robots.txt files at the most recent snapshot and are not directly comparable to January–March percentages. This change affects only the robots.txt section; AI bot traffic data, Workers AI data, and industry data use the same methodology as prior editions.
Data source: Cloudflare Radar AI Insights API endpoints (radar.cloudflare.com). Last updated: May 9, 2026.
About the Author: I'm James Bennett, Lead Engineer at WebSearchAPI.ai, where I architect the core retrieval engine enabling LLMs and AI agents to access real-time, structured web data with over 99.9% uptime and sub-second query latency. With a background in distributed systems and search technologies, I've reduced AI hallucination rates by 45% through advanced ranking and content extraction pipelines for RAG systems. My expertise includes AI infrastructure, search technologies, large-scale data integration, and API architecture for real-time AI applications.
Credentials: B.Sc. Computer Science (University of Cambridge), M.Sc. Artificial Intelligence Systems (Imperial College London), Google Cloud Certified Professional Cloud Architect, AWS Certified Solutions Architect, Microsoft Azure AI Engineer, Certified Kubernetes Administrator, TensorFlow Developer Certificate.