What is PerplexityBot?
PerplexityBot is Perplexity's web crawler that retrieves content for real-time answers. Learn how it works and whether to allow or block it.
PerplexityBot is the web crawler Perplexity AI uses to fetch and index content for generating real-time, cited answers to user queries.
PerplexityBot crawls websites to power Perplexity's answer engine, which serves over 15 million monthly users. Unlike traditional search crawlers that just index pages, PerplexityBot retrieves content that gets directly quoted and cited in AI-generated responses. Allowing PerplexityBot means your content can appear as a source in Perplexity answers.
Deep Dive
PerplexityBot operates differently from both traditional search crawlers like Googlebot and AI training crawlers like GPTBot. Its purpose is retrieval, not training: when a user asks Perplexity a question, PerplexityBot fetches relevant pages in real time so Perplexity can synthesize an answer with citations.
The crawler identifies itself with a user-agent string containing "PerplexityBot" and respects robots.txt directives. It crawls from documented IP ranges, making it straightforward to identify in your server logs. Perplexity also operates a second agent for user-initiated requests, which its documentation names "Perplexity-User" and which carries its own user-agent token. Because those fetches are triggered by a user rather than by automated crawling, check Perplexity's current documentation for how that agent treats robots.txt before relying on a separate rule for it.
What makes PerplexityBot significant is the direct attribution model. When Perplexity cites your content, users see your URL alongside the synthesized answer. Click-through rates vary, but the citation itself provides brand visibility regardless of whether users click. For publishers tracking referral traffic, Perplexity visits appear with the referrer "perplexity.ai" in analytics.
The blocking decision is more nuanced than with training crawlers. Block PerplexityBot and you prevent your content from being cited in one of the fastest-growing AI search platforms. Allow it and you participate in a new distribution channel, but one where Perplexity summarizes your content rather than sending users directly to your page.
Some publishers have raised concerns about PerplexityBot's crawl behavior, particularly around rate limiting and scraping paywalled content. Perplexity has responded by implementing stricter rate limits and honoring paywall signals. If you're experiencing aggressive crawling, you can set crawl-delay directives in robots.txt. Crawl-delay is a non-standard directive that not every crawler honors, so server-side rate limiting is a more reliable backstop.
For brands optimizing for AI visibility, PerplexityBot represents a trackable opportunity. Unlike ChatGPT, which doesn't consistently cite sources, Perplexity's citation model means you can directly measure when and how your content appears in AI answers. This makes Perplexity a useful proving ground for understanding AI-driven content distribution.
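Because a user-agent string is easy to spoof, it can help to pair the user-agent check with the IP-range check mentioned above. The following is a minimal sketch, not Perplexity's own method: the CIDR ranges are placeholder documentation blocks (RFC 5737), and the function names `is_from_ranges` and `classify_request` are hypothetical. Replace the ranges with the list Perplexity actually publishes.

```python
import ipaddress

# ASSUMPTION: placeholder ranges from RFC 5737 (reserved for documentation).
# These are NOT Perplexity's real ranges; substitute the published list.
ASSUMED_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def is_from_ranges(ip, ranges=ASSUMED_RANGES):
    """Return True if the IP address falls inside any of the CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in ranges)


def classify_request(ip, user_agent, ranges=ASSUMED_RANGES):
    """Label a request by its claimed user agent and its source IP.

    "other"      -> user agent does not mention PerplexityBot
    "verified"   -> claims PerplexityBot and IP is inside the ranges
    "unverified" -> claims PerplexityBot but IP is outside the ranges
    """
    if "perplexitybot" not in user_agent.lower():
        return "other"
    return "verified" if is_from_ranges(ip, ranges) else "unverified"
```

A request labeled "unverified" is not necessarily malicious (the range list may simply be out of date), so treat the label as a prompt to investigate rather than as grounds for an automatic block.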
Why It Matters
PerplexityBot represents the clearest example of how AI search creates new visibility dynamics. With 15M+ monthly active users and growing, Perplexity is becoming a meaningful traffic and brand awareness channel. The decision to allow or block PerplexityBot is a strategic one. Unlike training crawlers where the value exchange is unclear, PerplexityBot offers direct attribution: your content gets cited, your brand gets mentioned, and you can track it happening. For brands building AI visibility strategies, Perplexity provides the clearest feedback loop on what content performs in AI contexts.
Key Takeaways
Retrieval crawler, not training crawler: PerplexityBot fetches content in real time to generate answers, not to train AI models. Your content is used for retrieval-augmented generation, with citations pointing back to your pages.
Citations provide measurable brand visibility: Unlike black-box AI training, Perplexity's citation model lets you see exactly when your content is referenced. This creates a trackable metric for AI visibility that most other platforms don't offer.
Blocking means no Perplexity citations: If you block PerplexityBot via robots.txt, your content won't appear in Perplexity answers at all. This is a binary choice with direct visibility implications.
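Separate controls for main and user crawlers: PerplexityBot and the user-initiated agent (named "Perplexity-User" in Perplexity's documentation) are distinct user agents, so you can write separate rules for each. Confirm in Perplexity's current documentation how the user-initiated agent treats robots.txt before relying on this to block user-initiated research requests.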
Frequently Asked Questions
What is PerplexityBot?
PerplexityBot is the web crawler operated by Perplexity AI to fetch content for its answer engine. When users ask Perplexity questions, PerplexityBot retrieves relevant web pages so Perplexity can synthesize answers and cite sources. It identifies itself with the user-agent string "PerplexityBot" and respects robots.txt.
How do I block PerplexityBot?
Add these two lines to your robots.txt file: "User-agent: PerplexityBot" followed by "Disallow: /" on the next line. To also target user-initiated requests, add the same rule under "User-agent: Perplexity-User". The crawler re-checks robots.txt and should stop crawling within a few days.
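Before deploying a rule like this, you can sanity-check it locally. The sketch below uses Python's standard `urllib.robotparser` to parse a robots.txt body and ask whether a given agent may fetch a URL. The robots.txt content, the `example.com` URL, and the helper name `is_allowed` are illustrative assumptions, and the standard-library parser only approximates how any real crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block PerplexityBot, allow everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("PerplexityBot", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "https://example.com/page"))
```

Additional user-agent groups can be checked the same way by passing their token as `agent`.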
Should I allow or block PerplexityBot?
It depends on your priorities. Allowing PerplexityBot means your content can be cited in Perplexity answers, providing brand visibility and potential referral traffic. Blocking prevents this visibility but also stops Perplexity from summarizing your content. Unlike training crawlers, the value exchange here is more direct and measurable.
What's the difference between PerplexityBot and GPTBot?
PerplexityBot fetches content for real-time answer generation with citations, while GPTBot crawls for OpenAI's model training. PerplexityBot creates immediate, trackable visibility through citations. GPTBot's impact is indirect and harder to measure since it feeds training data rather than generating cited answers.
Does PerplexityBot respect rate limits?
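Yes, PerplexityBot honors crawl-delay directives in robots.txt. If you're experiencing heavy crawling, add "Crawl-delay: 10" (the value is a delay in seconds; choose what suits your server) under the PerplexityBot user-agent rules. Crawl-delay is a non-standard directive, so pair it with server-side rate limiting if crawl load is a real concern. Perplexity has also implemented server-side rate limiting following publisher feedback.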
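To confirm that a Crawl-delay line is attached to the intended user-agent group, you can read it back with Python's standard `urllib.robotparser`. This is a minimal sketch: the robots.txt body and the helper name `crawl_delay_for` are assumptions for illustration, and it shows only how the file parses, not how PerplexityBot behaves in practice.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: ask PerplexityBot to wait 10 seconds between requests.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Crawl-delay: 10
Allow: /
"""


def crawl_delay_for(robots_txt, agent):
    """Return the Crawl-delay (in seconds) that applies to `agent`, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    print(crawl_delay_for(ROBOTS_TXT, "PerplexityBot"))
    print(crawl_delay_for(ROBOTS_TXT, "Googlebot"))
```

A result of None for an agent means no delay applies to it, which is a quick way to catch a directive placed under the wrong group.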
How can I see if PerplexityBot is crawling my site?
Check your server access logs for user-agent strings containing "PerplexityBot" or "Perplexity-User" (the name Perplexity's documentation gives its user-initiated agent). Most analytics platforms also show bot traffic if you have bot filtering disabled. You'll see requests from Perplexity's documented IP ranges accessing your pages.
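The log check can be scripted. The sketch below assumes access logs in the common "combined" format, where the user agent is the last double-quoted field on each line; the token pattern and the function name `count_perplexity_hits` are assumptions for illustration, so adjust both to match your server's log format and the agent names you actually observe.

```python
import re
from collections import Counter

# ASSUMPTION: combined log format, with the user agent as the last quoted field.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"\s*$')

# ASSUMPTION: tokens to look for inside the user-agent field (case-sensitive).
AGENT_TOKEN = re.compile(r"Perplexity(?:Bot|-User)")


def count_perplexity_hits(lines):
    """Count log lines per Perplexity user-agent token found in the UA field."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not look like combined-format entries
        token = AGENT_TOKEN.search(match.group("agent"))
        if token:
            counts[token.group(0)] += 1
    return counts


if __name__ == "__main__":
    # The path is a placeholder; point it at your own access log.
    with open("access.log", encoding="utf-8", errors="replace") as handle:
        for agent, hits in count_perplexity_hits(handle).items():
            print(agent, hits)
```

Counting by token rather than by full user-agent string keeps the tally stable if the version number in the string changes.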