The robots.txt file controls which parts of your site crawlers can access. Officially adopted as RFC 9309 in September 2022 after 28 years as a de facto standard, it's primarily used to manage crawl budget on large sites and block AI training bots.
Robots.txt is not a security mechanism—crawlers can ignore it. For individual page control, use meta robots tags instead.
To get started quickly with a static robots.txt, add the file to your public directory:
public/
  robots.txt
Add your rules:
# Allow all crawlers
User-agent: *
Disallow:
# Optionally point to your sitemap
Sitemap: https://mysite.com/sitemap.xml
For environment-specific rules (e.g., blocking all crawlers in staging), generate robots.txt server-side:
import express from 'express'

const app = express()

// Serve robots.txt dynamically: block everything outside production
app.get('/robots.txt', (req, res) => {
  const isDev = process.env.NODE_ENV !== 'production'
  const robots = isDev
    ? 'User-agent: *\nDisallow: /'
    : 'User-agent: *\nDisallow:\nSitemap: https://mysite.com/sitemap.xml'
  res.type('text/plain').send(robots)
})

app.listen(3000)
// server.js for Vite SSR: intercept /robots.txt before the SSR handler
import express from 'express'

const app = express()

app.use((req, res, next) => {
  if (req.path === '/robots.txt') {
    const isDev = process.env.NODE_ENV !== 'production'
    const robots = isDev
      ? 'User-agent: *\nDisallow: /'
      : 'User-agent: *\nDisallow:\nSitemap: https://mysite.com/sitemap.xml'
    return res.type('text/plain').send(robots)
  }
  next()
})
// h3 / Nitro: e.g. as a Nuxt server middleware (server/middleware/robots.js)
import { defineEventHandler, setHeader } from 'h3'

export default defineEventHandler((event) => {
  if (event.path === '/robots.txt') {
    const isDev = process.env.NODE_ENV !== 'production'
    const robots = isDev
      ? 'User-agent: *\nDisallow: /'
      : 'User-agent: *\nDisallow:\nSitemap: https://mysite.com/sitemap.xml'
    setHeader(event, 'Content-Type', 'text/plain')
    return robots
  }
})
The robots.txt file consists of directives grouped by user agent. Google uses the most specific matching rule based on path length:
# Define which crawler these rules apply to
User-agent: *
# Block access to specific paths
Disallow: /admin
# Allow access to specific paths (optional, more specific than Disallow)
Allow: /admin/public
# Point to your sitemap
Sitemap: https://mysite.com/sitemap.xml
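To make the longest-match rule concrete, here is an illustrative sketch (not Google's actual matcher, and it ignores the * and $ wildcards described below): with the rules above, /admin/public/logo.png is crawlable because the matching Allow path (/admin/public) is longer than the matching Disallow path (/admin).
// Illustrative only: simplified longest-match resolution, no wildcard support
function isAllowed(path, rules) {
  // rules: [{ type: 'allow' | 'disallow', pattern: '/admin' }, ...]
  const longestMatch = (type) =>
    rules
      .filter((r) => r.type === type && path.startsWith(r.pattern))
      .reduce((max, r) => Math.max(max, r.pattern.length), -1)
  const allow = longestMatch('allow')
  const disallow = longestMatch('disallow')
  if (disallow === -1) return true   // no Disallow matches: crawling is allowed
  return allow >= disallow           // longer path wins; Allow wins a tie
}

isAllowed('/admin/public/logo.png', [
  { type: 'disallow', pattern: '/admin' },
  { type: 'allow', pattern: '/admin/public' },
]) // => true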
The User-agent directive specifies which crawler the rules apply to:
# All crawlers
User-agent: *
# Just Googlebot
User-agent: Googlebot
# Multiple specific crawlers
User-agent: Googlebot
User-agent: Bingbot
Disallow: /private
Common crawler user agents:
- Googlebot (Google Search)
- Bingbot (Microsoft Bing)
- GPTBot (OpenAI, AI training)
- ClaudeBot (Anthropic)
- CCBot (Common Crawl)
- Google-Extended (Google AI training)
- facebookexternalhit, Twitterbot, Slackbot (social link previews)
The Allow and Disallow directives control path access:
User-agent: *
# Block all paths starting with /admin
Disallow: /admin
# Block a specific file
Disallow: /private.html
# Block files with specific extensions
Disallow: /*.pdf$
# Block URL parameters
Disallow: /*?*
Wildcards supported (RFC 9309):
- * — matches zero or more characters
- $ — matches the end of the URL
The Sitemap directive tells crawlers where to find your sitemap.xml:
Sitemap: https://mysite.com/sitemap.xml
# Multiple sitemaps
Sitemap: https://mysite.com/products-sitemap.xml
Sitemap: https://mysite.com/blog-sitemap.xml
Crawl-delay is not part of RFC 9309. Google ignores it; Bing and Yandex support it:
User-agent: Bingbot
Crawl-delay: 10 # seconds between requests
For Google, crawl rate is managed in Search Console.
Robots.txt is not a security mechanism. Malicious crawlers ignore it, and listing paths in Disallow reveals their location to attackers.
Common mistake:
# ❌ Advertises your admin panel location
User-agent: *
Disallow: /admin
Disallow: /wp-admin
Disallow: /api/internal
Never use robots.txt to hide sensitive content; use authentication and proper access controls instead. See our security guide for details.
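For instance (a sketch only, reusing the Express app from the earlier examples; the session check is a placeholder for whatever auth you actually use), gate the path rather than listing it in robots.txt:
// Require authentication for /admin instead of advertising it in robots.txt
app.use('/admin', (req, res, next) => {
  // Placeholder: swap in your real session / token validation
  if (req.session?.user) return next()
  res.status(401).send('Authentication required')
})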
Blocking a URL in robots.txt prevents crawling but doesn't prevent indexing. If other sites link to the URL, Google can still index it without crawling, showing the URL with no snippet.
To prevent indexing, use a noindex meta tag, which requires allowing the crawl. Don't block pages with noindex in robots.txt—Google can't see the tag if it can't crawl.
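A rough illustration, again assuming Express (the route and HTML here are made up for the example; X-Robots-Tag is the header equivalent of the meta tag): the page stays crawlable but opts out of the index.
app.get('/internal-report', (req, res) => {
  // Header equivalent of <meta name="robots" content="noindex">
  res.set('X-Robots-Tag', 'noindex')
  // Or include the meta tag in the rendered HTML itself
  res.send(`<!doctype html>
<html>
  <head><meta name="robots" content="noindex"></head>
  <body>Crawlable, but kept out of the index</body>
</html>`)
})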
Google needs JavaScript and CSS to render pages. Blocking them breaks indexing:
# ❌ Prevents Google from rendering your Vue app
User-agent: *
Disallow: /assets/
Disallow: /*.js$
Disallow: /*.css$
Vue apps are JavaScript-heavy. Never block .js, .css, or /assets/ from Googlebot.
Copy-pasting a dev robots.txt to production blocks all crawlers:
# ❌ Accidentally left from staging
User-agent: *
Disallow: /
Use dynamic generation or environment checks to avoid this.
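One safeguard (a sketch only; the script name and location are assumptions, and it assumes the static public/robots.txt approach): fail the production deploy if the committed file blocks everything.
// scripts/check-robots.js (hypothetical): run before a production deploy
import { readFileSync } from 'node:fs'

const robots = readFileSync('public/robots.txt', 'utf8')

// A bare "Disallow: /" blocks the entire site for the matching user agents
if (/^Disallow:\s*\/\s*$/m.test(robots)) {
  console.error('robots.txt blocks all crawlers; refusing to deploy')
  process.exit(1)
}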
Blocking pages doesn't remove them from search results. Use noindex meta tags for that.
Visit https://yoursite.com/robots.txt to confirm it loads (the file must be served at /robots.txt in the site root).
To allow all crawlers:
User-agent: *
Disallow:
To block all crawlers, useful for staging or development environments:
User-agent: *
Disallow: /
See our security guide for more on environment protection.
GPTBot was the most blocked bot in 2024, fully disallowed by 250 domains. Blocking AI training bots doesn't affect search rankings:
# Block AI model training (doesn't affect Google search)
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
User-agent: Google-Extended
Disallow: /
Google-Extended is separate from Googlebot—blocking it won't hurt search visibility.
Two emerging standards let you express preferences about how AI systems use your content—without blocking crawlers entirely:
- Content-Usage (IETF aipref-vocab): y/n values for train-ai
- Content-Signal (Cloudflare Content Signals): yes/no values for search, ai-input, ai-train
Both go directly in robots.txt:
User-agent: *
Allow: /
# IETF aipref-vocab
Content-Usage: train-ai=n
# Cloudflare Content Signals
Content-Signal: search=yes, ai-input=no, ai-train=no
This allows crawlers to access your content for search indexing while blocking AI training and RAG/grounding uses. Both can be used together for broader coverage.
For private sites where you still want link previews:
# Block search engines
User-agent: Googlebot
User-agent: Bingbot
Disallow: /
# Allow social link preview crawlers
User-agent: facebookexternalhit
User-agent: Twitterbot
User-agent: Slackbot
Allow: /
If you have 10,000+ pages, block low-value URLs to focus crawl budget on important content:
User-agent: *
# Block internal search results
Disallow: /search?
# Block infinite scroll pagination
Disallow: /*?page=
# Block filtered/sorted product pages
Disallow: /products?*sort=
Disallow: /products?*filter=
# Block print versions
Disallow: /*/print
Sites under 1,000 pages don't need crawl budget optimization.
If you're using Nuxt, check out Nuxt SEO which handles much of this automatically.