Redirects for AI Training enforces canonical content

Cam Whiteside, David Belson, André Cruz

AI-Generated Summary: This is an automated summary created using AI. For the full details and context, please read the original post.

Cloudflare has introduced a new feature, Redirects for AI Training, to stop AI training crawlers from consuming outdated content. These crawlers ingest deprecated documentation at the same rate as current content, even when advisory signals mark that content as outdated. The new feature redirects verified AI training crawlers to up-to-date content, using the canonical tags already present in the page's HTML.

Key Technical Details

  • Redirects for AI Training operates on two inputs: Cloudflare's cf.verified_bot_category field and the <link rel="canonical"> tags already in your HTML.
  • The feature targets the AI Crawler category, which includes bots like GPTBot, ClaudeBot, and Bytespider.
  • When a request arrives from a verified AI Crawler, Cloudflare reads the response HTML and issues a 301 Moved Permanently to the canonical URL if a non-self-referencing canonical tag is present.
  • Human traffic, search indexing, and other automated traffic are unaffected.
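
The decision described in the bullets above can be sketched in a few lines of Python. This is only an illustration of the logic, not Cloudflare's implementation: the names redirect_target and CanonicalFinder are hypothetical, the literal string "AI Crawler" is assumed to be the value carried by cf.verified_bot_category, and the URL comparison (ignoring fragments and a trailing slash) is an assumption about what counts as "self-referencing".

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urldefrag, urljoin

# Assumed value of cf.verified_bot_category for AI training crawlers.
AI_CRAWLER_CATEGORY = "AI Crawler"


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonical = attr["href"]


def _normalize(url: str) -> str:
    # Assumption: fragments and a trailing slash do not change page identity.
    return urldefrag(url).url.rstrip("/")


def redirect_target(
    bot_category: Optional[str], request_url: str, html: str
) -> Optional[str]:
    """Return the URL to send in a 301, or None to serve the page unchanged."""
    # Humans, search crawlers, and other bots are never redirected.
    if bot_category != AI_CRAWLER_CATEGORY:
        return None

    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None

    # Resolve relative hrefs against the requested URL.
    canonical = urljoin(request_url, finder.canonical)

    # A self-referencing canonical tag means the page is already current.
    if _normalize(canonical) == _normalize(request_url):
        return None
    return canonical
```

With this sketch, a verified AI crawler requesting a deprecated page whose canonical tag points at the current version gets that URL back as the redirect target, while the same request from a browser or a search crawler returns None and the page is served as usual.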

Practical Implications for Developers

  • This feature ensures that AI training crawlers consume up-to-date content, reducing the risk of training on outdated information.
  • Developers can take advantage of existing canonical tags in their HTML to enable smooth redirects for AI training crawlers.
  • The feature is available on all paid Cloudflare plans.
