Robots.txt in Vue & Nuxt
Introduction
The robots.txt file lives in your site's root web directory and is a common way to control how crawlers access your site.
✅ Good for:
- Blocking large site sections (e.g., /admin/*)
- Managing crawler bandwidth on heavy pages (e.g., search, infinite scroll)
- Preventing crawling of development sites
❌ Don't use for:
- Protecting sensitive data (crawlers can ignore rules)
- Individual page indexing (use meta robots instead)
- Removing existing pages from search results
Implementing robots.txt is straightforward: you can either create a static file or generate one dynamically in your Vue / Nuxt application.
Quick Setup
To get started quickly with a static robots.txt, add the file to your public directory:
public/
robots.txt
Add your rules:
# Allow all crawlers
User-agent: *
Disallow:
# Optionally point to your sitemap
Sitemap: https://mysite.com/sitemap.xml
Dynamic Implementation
In some cases, you may prefer a dynamic robots.txt file: one generated server-side based on the environment or other factors.
// example using Vite SSR with an Express server
import express from 'express'

function createServer() {
  const app = express()
  // ..
  // serve robots.txt dynamically as plain text
  app.get('/robots.txt', (req, res) => {
    const robots = 'User-agent: *\nDisallow: /admin'
    res.type('text/plain').send(robots)
  })
  // ..
}
// server/routes/robots.txt.ts
import { getRequestHost } from 'h3'

// serve a restrictive robots.txt on staging hosts, a permissive one elsewhere
export default defineEventHandler((e) => {
  const host = getRequestHost(e)
  return host.includes('staging')
    ? 'User-agent: *\nDisallow: /'
    : 'User-agent: *\nDisallow:'
})
Using Nuxt? The Nuxt Robots module can handle this automatically.
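As a rough sketch, registering the module in nuxt.config.ts might look like the following (the robots.disallow option shown here is an assumption - check the module documentation for the exact keys your version supports):
// nuxt.config.ts - a minimal sketch; option names may differ between module versions
export default defineNuxtConfig({
  modules: ['@nuxtjs/robots'],
  robots: {
    // assumed option: block /admin for all user agents
    disallow: ['/admin'],
  },
})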
Understanding robots.txt
The robots.txt file consists of these main directives:
# Define which crawler these rules apply to
User-agent: *
# Block access to specific paths
Disallow: /admin
# Allow access to specific paths (optional)
Allow: /admin/public
# Point to your sitemap
Sitemap: https://mysite.com/sitemap.xml
User-agent
The User-agent directive specifies which crawler the rules apply to:
# All crawlers
User-agent: *
# Just Googlebot
User-agent: Googlebot
# Multiple specific crawlers
User-agent: Googlebot
User-agent: Bingbot
Disallow: /private
Common crawler user agents:
- Googlebot: Google's crawler
- Bingbot: Microsoft's crawler
- FacebookExternalHit: Facebook's crawler
- GPTBot: OpenAI's crawler
- Claude-Web: Anthropic's crawler
Allow / Disallow
The Allow and Disallow directives control path access:
User-agent: *
# Block all paths starting with /admin
Disallow: /admin
# Block a specific file
Disallow: /private.html
# Block files with specific extensions
Disallow: /*.pdf$
# Block URL parameters
Disallow: /*?*
Path matching uses simple pattern matching:
- * matches any sequence of characters
- $ matches the end of the URL
- Paths are relative to the root domain
Sitemap
The Sitemap directive tells crawlers where to find your sitemap.xml:
Sitemap: https://mysite.com/sitemap.xml
# Multiple sitemaps
Sitemap: https://mysite.com/products-sitemap.xml
Sitemap: https://mysite.com/blog-sitemap.xml
Yandex Directives
The Yandex search engine introduced additional directives, of which only Clean-Param is useful:
- Clean-Param: removes URL parameters from the URL before crawling
- Host: specifies the host name of the site (unused)
- Crawl-Delay: specifies the delay between requests (unused)
If you need to use this, you should target the Yandex user agent:
# Remove URL parameters
User-Agent: Yandex
Clean-Param: param1 param2
Security Considerations
- robots.txt is publicly visible - avoid revealing sensitive URL patterns
- Not all crawlers follow the rules - see our security guide
SEO Impact
- Blocking search crawlers prevents indexing but doesn't remove existing pages
- For page-level control, use meta robots tags instead (see the sketch after this list)
- Blocked resources can affect page rendering and SEO
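As an example, here is a minimal sketch of a page-level noindex in a Nuxt page, using the useHead composable (auto-imported in Nuxt 3); the noindex, follow value is just one common choice:
<script setup lang="ts">
// mark this page as noindex at the page level instead of blocking it in robots.txt
useHead({
  meta: [
    { name: 'robots', content: 'noindex, follow' },
  ],
})
</script>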
Common Mistakes
- Blocking CSS/JS/Assets
# ❌ May break page rendering
User-agent: *
Disallow: /assets
Disallow: /css
- Using robots.txt for Authentication
# ❌ Not secure
User-agent: *
Disallow: /admin
- Blocking Site Features
# ❌ Better to use meta robots
User-agent: *
Disallow: /search
Testing
Using Google's Tools
- Visit Google's robots.txt Tester
- Add your site
- Test specific URLs
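You can also sanity-check the file directly from a script. A minimal sketch, assuming Node 18+ (for the global fetch), a dev server at http://localhost:3000, and that /admin is the path you expect to be blocked - adjust all three to your setup:
// check-robots.ts - quick sanity check against a locally running site
const res = await fetch('http://localhost:3000/robots.txt')
if (!res.ok)
  throw new Error(`robots.txt returned HTTP ${res.status}`)

const body = await res.text()
// assumed rule: /admin should be disallowed in this example
if (!body.includes('Disallow: /admin'))
  console.warn('Expected /admin to be disallowed:\n' + body)
else
  console.log('robots.txt looks good:\n' + body)
Because it uses top-level await, run it with an ESM-aware runner such as npx tsx check-robots.ts.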
Common Patterns
Allow Everything (Default)
User-agent: *
Disallow:
Block Everything
Useful for staging or development environments.
User-agent: *
Disallow: /
See our security guide for more on environment protection.
Block AI Crawlers
User-agent: GPTBot
User-agent: Claude-Web
User-agent: CCBot
User-agent: Google-Extended
Disallow: /
Block Search While Allowing Social
# Block search engines
User-agent: Googlebot
User-agent: Bingbot
Disallow: /
# Allow social crawlers
User-agent: facebookexternalhit
User-agent: Twitterbot
Allow: /
Block Heavy Pages
User-agent: *
# Block search results
Disallow: /search
# Block filter pages
Disallow: /products?*
# Block print pages
Disallow: /*/print
Related
- Meta Robots Guide - Page-level crawler control
- Sitemaps Guide - Telling crawlers about your pages
- Security Guide - Protecting from malicious crawlers