Disabling Indexing
When viewing your sitemap.xml for the first time, you may notice some URLs you don't want included.
These URLs most likely come from Application Sources.
If you don't want to disable these sources entirely but do want to remove the URLs, you have a couple of options.
Disabling Page Indexing
If you don't want a URL in your sitemap because you don't want search engines to crawl it, you can use the `robots` route rule.
Note that to actually block search engines, you will need to use this together with the Nuxt Robots module.
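The route rule on its own only affects the sitemap output; for the robots rules to be served, the Nuxt Robots module needs to be registered. A minimal sketch, assuming the module is installed under the `@nuxtjs/robots` package name:

```typescript
export default defineNuxtConfig({
  // Nuxt Robots reads the same route rules to generate robots.txt
  // entries and X-Robots-Tag headers for the matched URLs
  modules: ['@nuxtjs/robots'],
})
```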
Disabling indexing for a pattern of URLs
If you have a pattern of URLs that you want hidden from search, you can use route rules.
```ts
export default defineNuxtConfig({
  routeRules: {
    // don't add any /secret/** URLs to the sitemap.xml
    '/secret/**': { robots: false },
  }
})
```
Disabling indexing for a Page
If you only need to cover a few specific pages, you can use the experimental `defineRouteRules`.
```vue
<script setup>
defineRouteRules({
  robots: false
})
</script>
```
Filter URLs with include / exclude
For all other cases, you can use the `include` and `exclude` module options to filter URLs.
```ts
export default defineNuxtConfig({
  sitemap: {
    // exclude all URLs that start with /secret
    exclude: ['/secret/**'],
    // include all URLs that start with /public
    include: ['/public/**'],
  }
})
```
Both options accept an array of strings, RegExp objects, or `{ regex: string }` objects.
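For example, the three accepted forms side by side (the paths here are made up for illustration):

```typescript
export default defineNuxtConfig({
  sitemap: {
    exclude: [
      '/admin/**',              // string: route-rules path matching
      /\/drafts\//,             // RegExp object
      { regex: '^/internal/' }, // { regex: string } object
    ],
  },
})
```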
Providing strings uses the route-rules path matching, which does not support variable path segments in front of static ones. For example, `/foo/**` will work, but `/foo/**/bar` will not. To get around this, use regex.
Regex Filtering
Filtering with regex is more powerful and can match more complex patterns. It's recommended to pass a `RegExp` object explicitly.
```ts
export default defineNuxtConfig({
  sitemap: {
    exclude: [
      // exclude /foo/**/bar using regex
      new RegExp('/foo/.*/bar')
    ],
  }
})
```