Scanner Category
AI Crawler Access Analysis
See exactly which AI crawlers can — and can't — reach your content. We analyse your robots.txt configuration against 40+ known AI bots so you can make informed decisions about AI discoverability.
What It Does
Your robots.txt file is the first thing AI crawlers check before accessing your content. If it blocks them — intentionally or not — your content is invisible to AI search engines like ChatGPT, Perplexity, and Google AI Overviews.
GEO Lantern's AI Crawler Access analysis fetches your robots.txt and evaluates it against over 40 known AI crawler user-agents. We show you the exact access status for each major AI bot: allowed, blocked, or not specifically addressed.
This category accounts for 20% of your AI readiness score. Crucially, it's a binary gatekeeper — if AI crawlers are blocked, nothing else matters because they simply cannot see your content. That's why we make it one of the first things to check.
Major AI Crawlers
Bots We Check For
These are the major AI crawlers — GEO Lantern checks for over 40 in total.
GPTBot
Search & Training · OpenAI
Powers ChatGPT search and browsing features.
ChatGPT-User
Search · OpenAI
Used when ChatGPT users actively browse the web during conversations.
ClaudeBot
Search & Training · Anthropic
Crawls content for Claude's web search capabilities.
PerplexityBot
Search · Perplexity AI
Fetches content for Perplexity's real-time search answers.
Bytespider
Search & Training · ByteDance
Powers TikTok search and ByteDance AI products.
Google-Extended
Training · Google
Controls whether your content is used for Gemini and AI training (separate from Googlebot).
Applebot-Extended
Training · Apple
Controls content usage for Apple Intelligence features.
cohere-ai
Training · Cohere
Crawls for Cohere's enterprise AI products and search.
Step by Step
How It Works
We check your robots.txt against every known AI crawler.
Fetch your robots.txt
GEO Lantern retrieves your robots.txt file from the standard location at your domain root.
Parse all directives
We parse every User-agent block, Allow/Disallow rule, and any experimental directives like content-usage or disallow-ai-training.
Check 40+ AI crawlers
Each known AI crawler is evaluated against your rules to determine whether it is allowed, blocked, or has no specific directive.
Report and recommend
You receive a clear breakdown showing the access status of each major AI crawler, with recommendations based on your visibility goals.
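The parse-and-check steps above can be sketched with Python's standard-library robots.txt parser. The robots.txt content, the example.com URL, and the crawler list are illustrative (a small subset of the 40+ user-agents actually checked), and note that `urllib.robotparser` cannot distinguish "allowed by a specific rule" from "no specific directive", since both fall back to the wildcard group, so this sketch reports only allowed/blocked:

```python
from urllib import robotparser

# Illustrative robots.txt; in a real scan this is fetched from the
# domain root (step 1).
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

# A small illustrative subset of known AI crawler user-agents.
AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "ClaudeBot",
               "PerplexityBot", "Google-Extended"]

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())  # step 2: parse all directives

# Step 3: evaluate each crawler against the parsed rules.
results = {bot: rp.can_fetch(bot, "https://example.com/")
           for bot in AI_CRAWLERS}
for bot, allowed in results.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

Here GPTBot is allowed by its own group, Google-Extended is blocked by its own group, and the remaining bots inherit the wildcard `Allow: /`.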
FAQ
Frequently Asked Questions
What are AI crawlers?
AI crawlers are automated bots operated by AI companies to fetch web content. Unlike traditional search engine crawlers (like Googlebot), AI crawlers gather content specifically for AI-powered features — search answers, chatbot responses, and AI model training. Major AI crawlers include GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot (Perplexity AI), and Google-Extended (Google).
How do I control which AI crawlers can access my site?
You control AI crawler access through your robots.txt file. Each AI crawler has a specific User-agent name. You can allow or block individual crawlers by adding rules like "User-agent: GPTBot" followed by "Allow: /" or "Disallow: /". This lets you grant access to search-tier crawlers while blocking training-tier crawlers if you prefer.
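For example, a robots.txt that admits OpenAI's search-facing bots while opting out of Google's training-only crawler might look like this (the user-agent tokens are real; the policy itself is just one illustrative choice):

```text
# Allow OpenAI's search and browsing bots
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

# Opt out of Google's AI-training crawler (Googlebot is unaffected)
User-agent: Google-Extended
Disallow: /

# Default for everything else
User-agent: *
Allow: /
```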
Should I block or allow AI crawlers?
It depends on your goals. If you want your content to appear in AI-powered search results (ChatGPT, Perplexity, Google AI Overviews), you need to allow the relevant crawlers. Blocking all AI crawlers means your content will not be referenced by these systems. Many site owners choose to allow search-tier crawlers while blocking training-only crawlers.
What is the difference between search-tier and training-tier crawlers?
Search-tier crawlers fetch content to provide real-time answers in AI search products — when someone asks a question and the AI retrieves your page to formulate a response. Training-tier crawlers gather content to train or fine-tune AI models. Some crawlers like GPTBot operate in both tiers. Google-Extended is purely training-tier and is separate from the main Googlebot search crawler.
How many AI crawlers does GEO Lantern check for?
GEO Lantern checks for over 40 known AI crawler user-agents. This includes major crawlers from OpenAI, Anthropic, Google, Apple, Meta, Perplexity, ByteDance, Cohere, and others. We regularly update our crawler database as new AI bots are deployed.
What if my robots.txt doesn't mention AI crawlers at all?
If your robots.txt has no specific rules for AI crawlers, their access falls back to your wildcard ("User-agent: *") group. If that group allows everything ("Allow: /"), all AI crawlers can access your site; if it disallows everything ("Disallow: /"), all crawlers (including AI) are blocked. GEO Lantern analyses your complete robots.txt to determine the effective access for each AI bot.
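Python's standard-library parser demonstrates this wildcard fallback; the robots.txt content and URLs are illustrative:

```python
from urllib import robotparser

# A robots.txt that never names an AI crawler -- only a wildcard group.
WILDCARD_ONLY = """\
User-agent: *
Disallow: /private/
"""

rp = robotparser.RobotFileParser()
rp.parse(WILDCARD_ONLY.splitlines())

# GPTBot has no group of its own, so it inherits the wildcard rules.
print(rp.can_fetch("GPTBot", "https://example.com/articles/"))  # True
print(rp.can_fetch("GPTBot", "https://example.com/private/x"))  # False
```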
Ready to See Your Score?
Run a free AI readiness scan and discover exactly how AI search engines perceive your website.