How Nuxt Robots Works
Nuxt Robots tells robots (crawlers) how to behave by creating a robots.txt file for you and adding an X-Robots-Tag header and <meta name="robots"> tag to your site where appropriate.
One important behaviour to control is blocking Google from indexing pages to:
- Prevent duplicate content issues
- Prevent wasting crawl budget
Robots.txt
For robots to understand how they can access your site, they will first check for a robots.txt file.
public
└── robots.txt
This file is generated differently depending on the environment:
- When deploying using nuxi generate or the nitro.prerender.routes rule, this is a static file.
- Otherwise, it's handled by the server and generated at runtime when requested.
When indexing is disabled, a robots.txt will be generated with the following content:
User-agent: *
Disallow: /
This blocks all bots from indexing your site.
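To see why this file blocks everything, here is a minimal sketch of how a crawler might evaluate it. This is not the module's code. The function names `disallowedPrefixes` and `isCrawlAllowed` are hypothetical, and the matching is deliberately simplified: it reads only `User-agent: *` groups with one User-agent line each, and it ignores `Allow` rules and wildcards.

```typescript
// Simplified, illustrative robots.txt evaluation (hypothetical helpers).
// Collects the Disallow prefixes that apply to every user agent.
function disallowedPrefixes(robotsTxt: string): string[] {
  const prefixes: string[] = [];
  let appliesToAll = false;
  for (const rawLine of robotsTxt.split("\n")) {
    // Drop comments and surrounding whitespace.
    const line = rawLine.replace(/#.*$/, "").trim();
    const separator = line.indexOf(":");
    if (separator === -1) continue;
    const field = line.slice(0, separator).trim().toLowerCase();
    const value = line.slice(separator + 1).trim();
    if (field === "user-agent") {
      appliesToAll = value === "*";
    } else if (field === "disallow" && appliesToAll && value !== "") {
      // An empty Disallow value blocks nothing, so it is skipped.
      prefixes.push(value);
    }
  }
  return prefixes;
}

// A path is blocked when it starts with any disallowed prefix.
function isCrawlAllowed(robotsTxt: string, path: string): boolean {
  return !disallowedPrefixes(robotsTxt).some((prefix) => path.startsWith(prefix));
}

const blocked = "User-agent: *\nDisallow: /\n";
console.log(isCrawlAllowed(blocked, "/about"));
```

Because every path starts with `/`, the single `Disallow: /` rule matches the whole site.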
X-Robots-Tag Header and <meta name="robots">
In some situations, the robots.txt becomes too restrictive to provide the level of control you need to manage your site's indexing.
For this reason, the module by default will provide an X-Robots-Tag header and <meta name="robots"> tag.
These are applied using the following logic:
- X-Robots-Tag header - Route Rules are implemented for all modes, otherwise SSR only. This will only be added when indexing has been disabled for the route.
- <meta name="robots"> - SSR only, will always be added.
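The logic above can be read as a small decision function. The sketch below is one interpretation of the two rules, not the module's implementation. The `RouteContext` shape and the `robotsSignalsFor` name are hypothetical, and it assumes that "Route Rules are implemented for all modes" means a header configured through Route Rules applies whether or not the route is server-rendered.

```typescript
// Hypothetical description of a route, for illustration only.
interface RouteContext {
  indexable: boolean;     // whether indexing is enabled for the route
  ssr: boolean;           // whether the route is server-side rendered
  viaRouteRules: boolean; // whether the robots rule comes from Route Rules
}

interface RobotsSignals {
  header: boolean; // should an X-Robots-Tag header be sent?
  meta: boolean;   // should a <meta name="robots"> tag be rendered?
}

function robotsSignalsFor(ctx: RouteContext): RobotsSignals {
  // Assumed reading: the header is available in every mode through
  // Route Rules, and otherwise only when rendering on the server.
  const headerSupported = ctx.viaRouteRules || ctx.ssr;
  return {
    // The header is only added when indexing is disabled for the route.
    header: headerSupported && !ctx.indexable,
    // The meta tag is SSR only and is always added there.
    meta: ctx.ssr,
  };
}

console.log(robotsSignalsFor({ indexable: false, ssr: true, viaRouteRules: false }));
```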
Robot Rules
The default values for the robots rule depend on the mode.
For indexable routes the following is used:
<meta name="robots" content="index, follow, max-image-preview:large, max-snippet:-1, max-video-preview:-1">
Besides giving robots the go-ahead, this also requests that Google:
"Choose the snippet length that it believes is most effective to help users discover your content and direct users to your site."
You can learn more in the Robots Meta Tag documentation. Feel free to change this to suit your needs using robotsEnabledValue.
For non-indexable routes the following is used:
<meta name="robots" content="noindex, nofollow">
This will tell robots to not index the page.
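The two defaults can be summarised in a short sketch. This is illustrative only and is not the module's source. The `robotsMetaTag` function and its `enabledValue` parameter are hypothetical; the parameter stands in for the kind of override that robotsEnabledValue provides.

```typescript
// Default content values, copied from the tags shown above.
const INDEXABLE_CONTENT =
  "index, follow, max-image-preview:large, max-snippet:-1, max-video-preview:-1";
const NON_INDEXABLE_CONTENT = "noindex, nofollow";

// Hypothetical helper: renders the robots meta tag for a route.
// `enabledValue` lets a caller replace the default for indexable routes.
function robotsMetaTag(
  indexable: boolean,
  enabledValue: string = INDEXABLE_CONTENT,
): string {
  const content = indexable ? enabledValue : NON_INDEXABLE_CONTENT;
  return `<meta name="robots" content="${content}">`;
}

console.log(robotsMetaTag(true));
console.log(robotsMetaTag(false));
console.log(robotsMetaTag(true, "index, follow"));
```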
Development Environment
The module by default will disable indexing in development environments. This is for safety, as you don't want your development environment to be indexed by search engines.
# Block all bots
User-agent: *
Disallow: /
Production Environments
For production environments, the module will generate a robots.txt file that allows all bots.
Out of the box, this will be the following:
User-agent: *
Disallow:
This tells all bots that they can index your entire site.
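As a rough sketch of the environment-dependent defaults described in the last two sections, the function below returns the robots.txt body for each case. The `Environment` type and `defaultRobotsTxt` name are hypothetical. The module's own environment detection is not shown here, so the environment is passed in explicitly.

```typescript
// Hypothetical environment label, supplied by the caller.
type Environment = "development" | "production";

// Returns the default robots.txt body shown above for each environment.
function defaultRobotsTxt(env: Environment): string {
  const lines =
    env === "production"
      ? ["User-agent: *", "Disallow:"] // empty Disallow: nothing is blocked
      : ["# Block all bots", "User-agent: *", "Disallow: /"]; // block everything
  return lines.join("\n") + "\n";
}

console.log(defaultRobotsTxt("development"));
console.log(defaultRobotsTxt("production"));
```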