Robots.txt in Vue & Nuxt
Introduction
The robots.txt file is a plain text file served from the root of your domain (e.g., https://mysite.com/robots.txt) that tells crawlers which parts of your site they may and may not crawl.
✅ Good for:
- Blocking large site sections (e.g., /admin/*)
- Managing crawler bandwidth on heavy pages (e.g., search, infinite scroll)
- Preventing crawling of development sites
❌ Don't use for:
- Protecting sensitive data (crawlers can ignore rules)
- Individual page indexing (use meta robots instead)
- Removing existing pages from search results
Implementing robots.txt
Quick Setup
To get started quickly with a static robots.txt, create the file in your public directory so it is served from the root of your site:
public/
  robots.txt
Add your rules:
# Allow all crawlers
User-agent: *
Disallow:
# Optionally point to your sitemap
Sitemap: https://mysite.com/sitemap.xml
Dynamic Implementation
In some cases, you may prefer a dynamic robots.txt, for example when the rules depend on the environment or on runtime data. With Vite SSR you can add a route for it to your server:
// example using Vite SSR with an Express server
import express from 'express'

function createServer() {
  const app = express()
  // ..
  app.get('/robots.txt', (req, res) => {
    // Keep the directives flush-left so no stray whitespace is sent
    const robots = `
User-agent: *
Disallow: /admin
`.trim()
    res.type('text/plain').send(robots)
  })
  // ..
}
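The example above hard-codes its rules. If they should differ per environment, one option is to branch on an environment variable; a minimal sketch, assuming an Express-based SSR server and that NODE_ENV distinguishes production from preview or staging deploys:
import express from 'express'

const app = express()

// Block all crawling outside production (e.g. on preview or staging deploys)
app.get('/robots.txt', (req, res) => {
  const isProduction = process.env.NODE_ENV === 'production'
  const robots = isProduction
    ? 'User-agent: *\nDisallow: /admin'
    : 'User-agent: *\nDisallow: /'
  res.type('text/plain').send(robots)
})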
In Nuxt, you can do the same with a server route:
// server/routes/robots.txt.ts
import { getRequestHost } from 'h3'

// defineEventHandler is auto-imported in Nuxt server routes
export default defineEventHandler((e) => {
  const host = getRequestHost(e)
  // Block everything on staging, allow everything elsewhere
  return host.includes('staging')
    ? 'User-agent: *\nDisallow: /'
    : 'User-agent: *\nDisallow:'
})
Using Nuxt? The Nuxt Robots module can handle this automatically.
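If you do use the module, here is a minimal sketch of what the setup might look like in nuxt.config.ts; the module name @nuxtjs/robots is real, but treat the option names below as assumptions and check the module docs, since they vary between versions:
// nuxt.config.ts
export default defineNuxtConfig({
  modules: ['@nuxtjs/robots'],
  robots: {
    // assumed options for illustration; see the module docs for the current API
    disallow: ['/admin'],
    sitemap: 'https://mysite.com/sitemap.xml',
  },
})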
Understanding robots.txt
The robots.txt format is a simple set of plain-text directives:
# Define which crawler these rules apply to
User-agent: *
# Block access to specific paths
Disallow: /admin
# Allow access to specific paths (optional)
Allow: /admin/public
# Point to your sitemap
Sitemap: https://mysite.com/sitemap.xml
User-agent
The User-agent directive defines which crawlers the rules that follow apply to:
# All crawlers
User-agent: *
# Just Googlebot
User-agent: Googlebot
# Multiple specific crawlers
User-agent: Googlebot
User-agent: Bingbot
Disallow: /private
Common crawler user agents:
- Googlebot: Google's crawler
- Bingbot: Microsoft's crawler
- FacebookExternalHit: Facebook's crawler
- GPTBot: OpenAI's crawler
- Claude-Web: Anthropic's crawler
Allow / Disallow
The Disallow and Allow directives control which paths a crawler may request:
User-agent: *
# Block all paths starting with /admin
Disallow: /admin
# Block a specific file
Disallow: /private.html
# Block files with specific extensions
Disallow: /*.pdf$
# Block URL parameters
Disallow: /*?*
Path matching uses simple pattern matching; the sketch after this list shows how the rules behave:
- * matches any sequence of characters
- $ matches the end of the URL
- Paths are relative to the root domain
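To make those rules concrete, here is a rough sketch of how a Disallow pattern can be turned into a path check. It illustrates the wildcard semantics above; it is not how any particular crawler implements matching:
// Convert a robots.txt path pattern into a RegExp:
// '*' matches any sequence of characters, a trailing '$' anchors the end of the URL
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split('*')
    .map(part => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*')
  const anchored = escaped.endsWith('\\$')
    ? `${escaped.slice(0, -2)}$`
    : escaped
  return new RegExp(`^${anchored}`)
}

// Disallow: /*.pdf$ blocks /files/report.pdf but not /files/report.pdf?download=1
const blocked = patternToRegExp('/*.pdf$')
console.log(blocked.test('/files/report.pdf')) // true
console.log(blocked.test('/files/report.pdf?download=1')) // false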
Sitemap
The Sitemap directive tells crawlers where to find your sitemap:
Sitemap: https://mysite.com/sitemap.xml
# Multiple sitemaps
Sitemap: https://mysite.com/products-sitemap.xml
Sitemap: https://mysite.com/blog-sitemap.xml
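When robots.txt is generated dynamically, the Sitemap URL can be built from the incoming request instead of hard-coding the domain. A sketch extending the earlier Nuxt server route; getRequestProtocol and getRequestHost are real h3 helpers, the rest is illustrative:
// server/routes/robots.txt.ts
import { getRequestHost, getRequestProtocol } from 'h3'

export default defineEventHandler((e) => {
  // Build the sitemap URL from whatever host served the request
  const origin = `${getRequestProtocol(e)}://${getRequestHost(e)}`
  return `User-agent: *\nDisallow:\nSitemap: ${origin}/sitemap.xml`
})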
Yandex Directives
The Yandex search engine introduced additional directives, of which only Clean-Param is still in practical use:
- Clean-Param: Removes URL parameters from the URL before crawling
- Host: Specifies the host name of the site (unused)
- Crawl-Delay: Specifies the delay between requests (unused)
If you need to use this, you should target the Yandex user agent:
# Remove URL parameters
User-Agent: Yandex
Clean-Param: param1 param2
Security Considerations
- robots.txt is publicly visible, so avoid revealing sensitive URL patterns
- Not all crawlers follow the rules; see our security guide and the middleware sketch below
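Because compliance is voluntary, anything genuinely sensitive has to be enforced on the server, not in robots.txt. A rough sketch of one option using a Nuxt server middleware; the blocked user agents and the protected path are placeholders:
// server/middleware/block-bad-bots.ts
import { createError, getRequestHeader, getRequestURL } from 'h3'

// Hypothetical examples of crawlers you might want to refuse
const BLOCKED_AGENTS = ['BadBot', 'EvilScraper']

export default defineEventHandler((e) => {
  const ua = getRequestHeader(e, 'user-agent') ?? ''
  const path = getRequestURL(e).pathname
  // Refuse disallowed paths to crawlers that ignore robots.txt
  if (path.startsWith('/admin') && BLOCKED_AGENTS.some(agent => ua.includes(agent))) {
    throw createError({ statusCode: 403, statusMessage: 'Forbidden' })
  }
})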
SEO Impact
- Blocking search crawlers prevents indexing but doesn't remove existing pages
- For page-level control, use meta robots tags instead (see the sketch after this list)
- Blocked resources can affect page rendering and SEO
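For reference, page-level control usually means a robots meta tag set from the page itself. A minimal sketch using useHead, which comes from @unhead/vue in a plain Vue app and is auto-imported in Nuxt:
<script setup lang="ts">
import { useHead } from '@unhead/vue' // auto-imported in Nuxt

// Ask crawlers not to index this page while still following its links
useHead({
  meta: [
    { name: 'robots', content: 'noindex, follow' },
  ],
})
</script>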
Common Mistakes
- Blocking CSS/JS/Assets
# ❌ May break page rendering
User-agent: *
Disallow: /assets
Disallow: /css
- Using robots.txt for Authentication
# ❌ Not secure
User-agent: *
Disallow: /admin
- Blocking Site Features
# ❌ Better to use meta robots
User-agent: *
Disallow: /search
Testing
Using Google's Tools
- Visit Google's robots.txt Tester
- Add your site
- Test specific URLs
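You can also add a quick automated check so a staging Disallow: / never ships to production by accident. A sketch assuming Vitest and Node 18+'s built-in fetch, with the URL as a placeholder:
import { describe, expect, it } from 'vitest'

describe('robots.txt', () => {
  it('does not block the whole production site', async () => {
    // Placeholder URL; point this at the deployed site
    const res = await fetch('https://mysite.com/robots.txt')
    const body = await res.text()

    expect(res.status).toBe(200)
    // Fail if any line is exactly "Disallow: /"
    expect(body).not.toMatch(/^Disallow:\s*\/\s*$/m)
  })
})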
Common Patterns
Allow Everything (Default)
User-agent: *
Disallow:
Block Everything
Useful for staging or development environments.
User-agent: *
Disallow: /
See our security guide for more on environment protection.
Block AI Crawlers
User-agent: GPTBot
User-agent: Claude-Web
User-agent: CCBot
User-agent: Google-Extended
Disallow: /
Block Search While Allowing Social
# Block search engines
User-agent: Googlebot
User-agent: Bingbot
Disallow: /
# Allow social crawlers
User-agent: facebookexternalhit
User-agent: Twitterbot
Allow: /
Block Heavy Pages
User-agent: *
# Block search results
Disallow: /search
# Block filter pages
Disallow: /products?*
# Block print pages
Disallow: /*/print
Related
- Meta Robots Guide - Page-level crawler control
- Sitemaps Guide - Telling crawlers about your pages
- Security Guide - Protecting from malicious crawlers