Introduction

At a minimum, the only recommended robots configuration is to disable indexing for non-production environments.
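
If you're on the Nuxt SEO stack, one way to do this is through site config. A minimal sketch, assuming the indexable option provided by nuxt-site-config:

nuxt.config.ts
export default defineNuxtConfig({
  site: {
    // only allow crawlers to index production builds
    indexable: process.env.NODE_ENV === 'production'
  }
})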

Many sites will never need to configure their robots.txt or robots meta tag beyond this, as controlling web crawlers is an advanced use case.

However, if you're looking to get the best SEO and performance results, you may consider some of the recipes on this page for your site.

Robots.txt recipes

Blocking Bad Bots

If your site is getting hit by a lot of unwanted bot traffic, you may consider enabling the blockNonSeoBots option.

nuxt.config.ts
export default defineNuxtConfig({
  robots: {
    blockNonSeoBots: true
  }
})

This blocks mostly web scrapers; the full list is: Nuclei, WikiDo, Riddler, PetalBot, Zoominfobot, Go-http-client, Node/simplecrawler, CazoodleBot, dotbot/1.0, Gigabot, Barkrowler, BLEXBot, magpie-crawler.
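
With this enabled, the generated /robots.txt should contain a group along these lines (a sketch of the expected output; the exact formatting may differ):

robots.txt
User-agent: Nuclei
User-agent: WikiDo
# ...one entry per bot in the list above
Disallow: /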

Blocking AI Crawlers

AI crawlers can be beneficial, as they can help users find your site, but for educational sites or those not interested in being indexed by AI crawlers, you can block them using the blockAiBots option.

nuxt.config.ts
export default defineNuxtConfig({
  robots: {
    blockAiBots: true
  }
})

This will block the following AI crawlers: GPTBot, ChatGPT-User, Claude-Web, anthropic-ai, Applebot-Extended, Bytespider, CCBot, cohere-ai, Diffbot, FacebookBot, Google-Extended, ImagesiftBot, PerplexityBot, OmigiliBot, Omigili.

Blocking Privileged Pages

If you have pages that require authentication or are only available to certain users, you should block these from being indexed.

public/_robots.txt
User-agent: *
Disallow: /admin
Disallow: /dashboard

See Config using Robots.txt for more information.
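
If you'd rather keep these rules in your Nuxt config, the module can also read route rules. A sketch, assuming the robots: false route rule supported by the module:

nuxt.config.ts
export default defineNuxtConfig({
  routeRules: {
    // mark these routes as non-indexable
    '/admin/**': { robots: false },
    '/dashboard/**': { robots: false }
  }
})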

Whitelisting Open Graph Tags

If you have certain pages that you don't want indexed but still want their Open Graph tags to be crawled, you can target specific user agents.

public/_robots.txt
# Block search engines
User-agent: Googlebot
User-agent: Bingbot
Disallow: /user-profiles

# Allow social crawlers
User-agent: facebookexternalhit
User-agent: Twitterbot
Allow: /user-profiles

See Config using Robots.txt for more information.
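
The same rules can also be expressed in your Nuxt config; a sketch, assuming the module's groups option for per-user-agent rules:

nuxt.config.ts
export default defineNuxtConfig({
  robots: {
    groups: [
      {
        // block search engines from user profiles
        userAgent: ['Googlebot', 'Bingbot'],
        disallow: ['/user-profiles']
      },
      {
        // allow social crawlers so Open Graph tags can still be fetched
        userAgent: ['facebookexternalhit', 'Twitterbot'],
        allow: ['/user-profiles']
      }
    ]
  }
})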

Blocking Search Results

You may consider blocking search result pages from being indexed, as they can be seen as duplicate content and can provide a poor user experience.

public/_robots.txt
User-agent: *
# block search results
Disallow: /*?query=
# block pagination
Disallow: /*?page=
# block sorting
Disallow: /*?sort=
# block filtering
Disallow: /*?filter=
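
If you prefer config over a static file, the same patterns can be passed to the module's disallow option (assuming it accepts an array of path patterns, per the module docs):

nuxt.config.ts
export default defineNuxtConfig({
  robots: {
    // block query-string driven pages for all user agents
    disallow: ['/*?query=', '/*?page=', '/*?sort=', '/*?filter=']
  }
})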