---
title: "Robots.txt in Vue"
description: "Robots.txt tells crawlers what they can access. Here's how to set it up in Vue."
canonical_url: "https://nuxtseo.com/learn-seo/vue/controlling-crawlers/robots-txt"
last_updated: "2026-01-29"
---

<key-takeaways>

- robots.txt is advisory. Crawlers can ignore it, so never use it for security
- Primary uses: crawl budget optimization and blocking AI training bots
- Distinguish between AI training bots (blockable) and AI search bots (needed for traffic)

</key-takeaways>

The `robots.txt` file controls which parts of your site crawlers can access. [Officially adopted as RFC 9309](https://datatracker.ietf.org/doc/html/rfc9309) in September 2022 after 28 years as a de facto standard, it's primarily used to [manage crawl budget](https://developers.google.com/search/docs/crawling-indexing/large-site-managing-crawl-budget) on large sites and block AI training bots.

Robots.txt is not a security mechanism. [Crawlers can ignore it](https://developer.mozilla.org/en-US/docs/Web/Security/Practical_implementation_guides/Robots_txt). For individual page control, use [meta robots tags](/learn-seo/vue/controlling-crawlers/meta-tags) instead.

## Quick Setup

To get started quickly with a static `robots.txt`, add the file in your public directory:

```dir
public/
  robots.txt
```

Add your rules:

```robots-txt [robots.txt]
# Allow all crawlers
User-agent: *
Disallow:

# Optionally point to your sitemap
Sitemap: https://mysite.com/sitemap.xml
```

### Dynamic robots.txt

For environment-specific rules (e.g., blocking all crawlers in staging), generate `robots.txt` server-side:

<code-group>

```ts [Express]
import express from 'express'

const app = express()

app.get('/robots.txt', (req, res) => {
  // Block all crawlers outside production so staging never gets indexed
  const isDev = process.env.NODE_ENV !== 'production'
  const robots = isDev
    ? 'User-agent: *\nDisallow: /'
    : 'User-agent: *\nDisallow:\nSitemap: https://mysite.com/sitemap.xml'
  res.type('text/plain').send(robots)
})
```

```ts [Vite]
// server.js: custom Express server for Vite SSR, registered before Vite's own middleware
import express from 'express'

const app = express()

app.use((req, res, next) => {
  if (req.path === '/robots.txt') {
    const isDev = process.env.NODE_ENV !== 'production'
    const robots = isDev
      ? 'User-agent: *\nDisallow: /'
      : 'User-agent: *\nDisallow:\nSitemap: https://mysite.com/sitemap.xml'
    return res.type('text/plain').send(robots)
  }
  next()
})
```

```ts [H3]
import { defineEventHandler, setHeader } from 'h3'

// Used as server middleware: when the path doesn't match, returning nothing lets other handlers run
export default defineEventHandler((event) => {
  if (event.path === '/robots.txt') {
    const isDev = process.env.NODE_ENV !== 'production'
    const robots = isDev
      ? 'User-agent: *\nDisallow: /'
      : 'User-agent: *\nDisallow:\nSitemap: https://mysite.com/sitemap.xml'
    setHeader(event, 'Content-Type', 'text/plain')
    return robots
  }
})
```

</code-group>
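
The response body is identical in each case, so you can pull it into a small helper and keep the environment check in one place. A minimal sketch; the `buildRobots` helper and the hard-coded site URL are assumptions, not part of any framework API:

```ts [robots.ts]
// Shared robots.txt builder, imported wherever the route is handled
const SITE_URL = 'https://mysite.com'

export function buildRobots(): string {
  // Block everything outside production so staging never gets indexed
  if (process.env.NODE_ENV !== 'production') {
    return 'User-agent: *\nDisallow: /'
  }

  // Allow everything in production and point crawlers at the sitemap
  return ['User-agent: *', 'Disallow:', `Sitemap: ${SITE_URL}/sitemap.xml`].join('\n')
}
```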

## Robots.txt Syntax

The `robots.txt` file consists of directives grouped by user agent. Google [uses the most specific matching rule](https://developers.google.com/search/docs/crawling-indexing/robots/robots_txt) based on path length:

```robots-txt [robots.txt]
# Define which crawler these rules apply to
User-agent: *

# Block access to specific paths
Disallow: /admin

# Allow access to specific paths (optional, more specific than Disallow)
Allow: /admin/public

# Point to your sitemap
Sitemap: https://mysite.com/sitemap.xml
```

### User-agent

The `User-agent` directive specifies which crawler the rules apply to:

```robots-txt [robots.txt]
# All crawlers
User-agent: *

# Just Googlebot
User-agent: Googlebot

# Multiple specific crawlers
User-agent: Googlebot
User-agent: Bingbot
Disallow: /private
```

Common crawler user agents (2026):

- [Googlebot](https://developers.google.com/search/docs/advanced/crawling/overview-google-crawlers): Google's search crawler
- [Bingbot](https://ahrefs.com/seo/glossary/bingbot): Microsoft's search crawler
- [Applebot](https://support.apple.com/en-us/106381): Apple's search crawler
- [GPTBot](https://platform.openai.com/docs/bots/overview-of-openai-crawlers): OpenAI's training crawler
- [OAI-SearchBot](https://platform.openai.com/docs/bots/oai-searchbot): OpenAI's search crawler (for ChatGPT Search)
- [ClaudeBot](https://support.anthropic.com/en/articles/9906653-claude-bot-and-crawling): Anthropic's training crawler
- [Applebot-Extended](https://support.apple.com/en-us/119829): Apple's AI training crawler

### Allow / Disallow

The `Allow` and `Disallow` directives control path access:

```robots-txt [robots.txt]
User-agent: *
# Block all paths starting with /admin
Disallow: /admin

# Block a specific file
Disallow: /private.html

# Block files with specific extensions
Disallow: /*.pdf$

# Block URL parameters
Disallow: /*?*
```

Wildcards supported ([RFC 9309](https://datatracker.ietf.org/doc/html/rfc9309)):

- `*`: matches zero or more characters
- `$`: matches the end of the URL

Paths are case-sensitive and relative to the domain root.

### Sitemap

The `Sitemap` directive tells crawlers where to find your [sitemap.xml](/learn-seo/vue/controlling-crawlers/sitemaps):

```robots-txt [robots.txt]
Sitemap: https://mysite.com/sitemap.xml

# Multiple sitemaps
Sitemap: https://mysite.com/products-sitemap.xml
Sitemap: https://mysite.com/blog-sitemap.xml
```

### Crawl-Delay (Non-Standard)

`Crawl-Delay` is not part of [RFC 9309](https://datatracker.ietf.org/doc/html/rfc9309). Google ignores it. Bing and Yandex support it:

```robots-txt [robots.txt]
User-agent: Bingbot
Crawl-delay: 10  # seconds between requests
```

For Google, follow its [crawl budget guidance](https://developers.google.com/search/docs/crawling-indexing/large-site-managing-crawl-budget) instead.

## Security: Why robots.txt Fails

[Robots.txt is not a security mechanism](https://developer.mozilla.org/en-US/docs/Web/Security/Practical_implementation_guides/Robots_txt). Malicious crawlers ignore it, and listing paths in `Disallow` [reveals their location to attackers](https://www.searchenginejournal.com/robots-txt-security-risks/289719/).

**Common mistake:**

```robots-txt
# ❌ Advertises your admin panel location
User-agent: *
Disallow: /admin
Disallow: /wp-admin
Disallow: /api/internal
```

<danger>

Never use robots.txt to hide sensitive content. Listing paths in Disallow advertises their location to attackers, and malicious bots ignore robots.txt entirely. Use authentication and proper access controls instead.

</danger>

Use [proper authentication](https://developers.google.com/search/docs/crawling-indexing/block-indexing) instead. See our [security guide](/learn-seo/vue/routes-and-rendering/security) for details.

## Crawling vs Indexing

Blocking a URL in `robots.txt` prevents crawling but [doesn't prevent indexing](https://developers.google.com/search/docs/crawling-indexing/robots/intro). If other sites link to the URL, Google can still index it without crawling, showing the URL with no snippet.

To prevent indexing:

- Use [`noindex` meta tag](/learn-seo/vue/controlling-crawlers/meta-tags) (requires allowing crawl)
- Use password protection or authentication
- Return 404/410 status codes

Don't block `noindex` pages in `robots.txt`: Google can't see the tag if it can't crawl the page.
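
For page-level control in a Vue component, the `robots` meta tag is usually set through your head manager. A minimal sketch using `useHead` from `@unhead/vue`, assuming Unhead is already installed in your app:

```ts
// Inside a page component's <script setup>; requires @unhead/vue
import { useHead } from '@unhead/vue'

// noindex only works if the page is crawlable, so don't also block it in robots.txt
useHead({
  meta: [
    { name: 'robots', content: 'noindex' },
  ],
})
```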

## Common Mistakes

### 1. Blocking JavaScript and CSS

[Google needs JavaScript and CSS to render pages](https://developers.google.com/search/docs/crawling-indexing/javascript/javascript-seo-basics). Blocking them breaks indexing:

```robots-txt [robots.txt]
# ❌ Prevents Google from rendering your Vue app
User-agent: *
Disallow: /assets/
Disallow: /*.js$
Disallow: /*.css$
```

Vue apps are JavaScript-heavy. Never block `.js`, `.css`, or `/assets/` from Googlebot.

### 2. Blocking Dev Sites in Production

Copy-pasting a dev `robots.txt` to production blocks all crawlers:

```robots-txt [robots.txt]
# ❌ Accidentally left from staging
User-agent: *
Disallow: /
```

Use [dynamic generation](#dynamic-robots-txt) or environment checks to avoid this.
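
If your site is statically generated and has no server to intercept the request, you can generate the file at build time instead. A minimal sketch; the script path and build wiring are assumptions:

```ts [scripts/generate-robots.ts]
import { writeFileSync } from 'node:fs'

const isProduction = process.env.NODE_ENV === 'production'

// Staging and preview builds get a blanket Disallow so they never rank
const robots = isProduction
  ? 'User-agent: *\nDisallow:\nSitemap: https://mysite.com/sitemap.xml'
  : 'User-agent: *\nDisallow: /'

writeFileSync('public/robots.txt', `${robots}\n`)
```

Run it before `vite build` (for example via `tsx`) so the generated file ends up in the build output.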

### 3. Confusing robots.txt with noindex

Blocking pages doesn't remove them from search results. Use [`noindex` meta tags](/learn-seo/vue/controlling-crawlers/meta-tags) for that.

## Testing Your robots.txt

1. Check syntax: Visit `https://yoursite.com/robots.txt` to confirm it loads
2. [Google Search Console robots.txt tester](https://search.google.com/search-console/robots-txt) validates syntax and tests URLs
3. Verify crawlers can access: Check server logs for 200 status on `/robots.txt`
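
You can also script that last check. A minimal sketch using Node's built-in `fetch`; the URL is a placeholder for your own domain:

```ts
// Fails loudly if robots.txt is missing or served with the wrong status
const res = await fetch('https://mysite.com/robots.txt')

if (res.status !== 200) {
  throw new Error(`robots.txt returned ${res.status}`)
}

// robots.txt should be served as plain text
const contentType = res.headers.get('content-type') ?? ''
if (!contentType.includes('text/plain')) {
  console.warn(`Unexpected Content-Type: ${contentType}`)
}

console.log(await res.text())
```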

## Common Patterns

### Allow Everything (Default)

```robots-txt
User-agent: *
Disallow:
```

### Block Everything

Useful for staging or development environments.

```robots-txt
User-agent: *
Disallow: /
```

See our [security guide](/learn-seo/vue/routes-and-rendering/security) for more on environment protection.

### Block AI Training Crawlers

Blocking AI training bots is a common practice in 2026. This prevents models from using your content for training but doesn't affect your appearance in search results.

```robots-txt
# Block AI model training
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Applebot-Extended
User-agent: Google-Extended
User-agent: CCBot
Disallow: /
```

<warning>

Be careful not to block **Search Bots** like `OAI-SearchBot` or `Claude-SearchBot` (unless you want to be invisible in their search products). Blocking `GPTBot` is safe for search visibility; blocking `OAI-SearchBot` removes you from ChatGPT Search.

</warning>

### AI Directives: Content-Usage & Content-Signal

Two emerging standards let you express preferences about how AI systems use your content without blocking crawlers entirely:

- **Content-Usage** (IETF): Uses `y`/`n` values for `train-ai`
- **Content-Signal** (Cloudflare): Uses `yes`/`no` values for `search`, `ai-input`, `ai-train`

```robots-txt
User-agent: *
Allow: /

# IETF aipref-vocab
Content-Usage: train-ai=n

# Cloudflare Content Signals
Content-Signal: search=yes, ai-input=no, ai-train=no
```

This allows crawlers to access your content for search indexing while blocking AI training and RAG/grounding uses. You can use both together for broader coverage.

<warning>

AI directives rely on voluntary compliance. Crawlers can ignore them, so combine them with `User-agent` blocks for stronger protection.

</warning>

### Block Search, Allow Social Sharing

For private sites where you still want [link previews](/learn-seo/vue/mastering-meta/social-sharing):

```robots-txt
# Block search engines
User-agent: Googlebot
User-agent: Bingbot
Disallow: /

# Allow social link preview crawlers
User-agent: facebookexternalhit
User-agent: Twitterbot
User-agent: Slackbot
Allow: /
```

### Optimize Crawl Budget for Large Sites

If you have 10,000+ pages, [block low-value URLs](https://developers.google.com/search/docs/crawling-indexing/large-site-managing-crawl-budget) to focus crawl budget on important content:

```robots-txt
User-agent: *
# Block internal search results
Disallow: /search?
# Block infinite scroll pagination
Disallow: /*?page=
# Block filtered/sorted product pages
Disallow: /products?*sort=
Disallow: /products?*filter=
# Block print versions
Disallow: /*/print
```

Sites under 1,000 pages don't need crawl budget optimization.

## Using Nuxt?

If you're using Nuxt, check out [Nuxt SEO](/docs/nuxt-seo/getting-started/introduction) which handles much of this automatically.

[Learn more about robots.txt in Nuxt →](/learn-seo/nuxt/controlling-crawlers/robots-txt)
