---
title: "AI Directives"
description: "Control how AI systems interact with your content using Content-Usage and Content-Signal directives."
canonical_url: "https://nuxtseo.com/docs/robots/guides/ai-directives"
last_updated: "2026-05-25T04:24:06.197Z"
---

AI Directives allow you to express preferences about how AI systems, search engines, and automated tools should interact with your content. Two standards are supported:

- **Content-Usage** - IETF draft vocabulary with broader automation categories
- **Content-Signal** - Cloudflare's widely deployed implementation focused on AI use cases

Both can be used together in your robots.txt file. They are communicated through the robots.txt protocol, which signals your preferences but does not enforce them.

<callout icon="i-heroicons-cpu-chip" to="/tools/robots-txt-generator">

**Test AI crawler blocking** - Our [Robots.txt Generator](/tools/robots-txt-generator) includes presets for GPTBot, ClaudeBot, and other AI crawlers.

</callout>

<alert type="warning">

**Important:** AI directives rely on voluntary compliance by crawlers and AI systems. They are not enforced by web servers and should be combined with other protection methods for sensitive content.

</alert>
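
Because compliance is voluntary, one complementary measure is to refuse requests from known AI crawlers on the server. The following is a minimal sketch that matches on the `User-Agent` header. The `BLOCKED_AI_AGENTS` list and the `isBlockedAiCrawler` helper are hypothetical names for illustration, not part of the module, and a crawler can spoof its user agent.

```ts
// Hypothetical list of crawler tokens to refuse; adjust to your own policy.
const BLOCKED_AI_AGENTS = ['GPTBot', 'ClaudeBot', 'CCBot', 'Bytespider']

// Returns true when the User-Agent header contains one of the blocked tokens.
// Matching is case-insensitive because crawlers format the header differently.
function isBlockedAiCrawler(userAgent: string | undefined): boolean {
  if (!userAgent) return false
  const ua = userAgent.toLowerCase()
  return BLOCKED_AI_AGENTS.some(token => ua.includes(token.toLowerCase()))
}
```

A server middleware could call this helper and respond with a 403 status when it returns `true`.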

## Content-Usage (IETF aipref-vocab)

The Content-Usage directive follows the [IETF AI Preferences specification](https://ietf-wg-aipref.github.io/drafts/draft-ietf-aipref-vocab.html), providing a standardized way to express automation preferences.

### Categories

<table>
<thead>
  <tr>
    <th>
      Category
    </th>
    
    <th>
      Description
    </th>
    
    <th>
      Example Use Case
    </th>
  </tr>
</thead>

<tbody>
  <tr>
    <td>
      <code>
        bots
      </code>
    </td>
    
    <td>
      Automated Processing
    </td>
    
    <td>
      General bot access and crawling
    </td>
  </tr>
  
  <tr>
    <td>
      <code>
        train-ai
      </code>
    </td>
    
    <td>
      Foundation Model Production
    </td>
    
    <td>
      Training large language models
    </td>
  </tr>
  
  <tr>
    <td>
      <code>
        ai-output
      </code>
    </td>
    
    <td>
      AI Output
    </td>
    
    <td>
      AI generated responses and content
    </td>
  </tr>
  
  <tr>
    <td>
      <code>
        search
      </code>
    </td>
    
    <td>
      Search
    </td>
    
    <td>
      Indexing for search results
    </td>
  </tr>
</tbody>
</table>

### Values

- `y` - Allow this category of use
- `n` - Disallow this category of use

### Syntax

```txt [robots.txt]
User-agent: *
Content-Usage: <category>=<value>[, <category>=<value>]
Content-Usage: /path/ <category>=<value>[, <category>=<value>]
```
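
To make the syntax concrete, here is a small sketch that builds a `Content-Usage` line from a category map. The `formatContentUsage` helper and its types are hypothetical, written only to illustrate the format above; the module generates these lines for you.

```ts
type UsageCategory = 'bots' | 'train-ai' | 'ai-output' | 'search'
type UsageValue = 'y' | 'n'

// Builds one Content-Usage line from a category map, with an optional path prefix.
function formatContentUsage(
  prefs: Partial<Record<UsageCategory, UsageValue>>,
  path?: string,
): string {
  const pairs = Object.entries(prefs).map(([category, value]) => `${category}=${value}`)
  const scope = path ? `${path} ` : ''
  return `Content-Usage: ${scope}${pairs.join(', ')}`
}

formatContentUsage({ 'bots': 'y', 'train-ai': 'n' })
// "Content-Usage: bots=y, train-ai=n"
formatContentUsage({ 'train-ai': 'y' }, '/docs/')
// "Content-Usage: /docs/ train-ai=y"
```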

### Examples

#### Block AI Training Globally

```txt [robots.txt]
User-agent: *
Allow: /
Content-Usage: train-ai=n
```

#### Allow Bots, Block AI Training

```txt [robots.txt]
User-agent: *
Allow: /
Content-Usage: bots=y, train-ai=n
```

#### Path-Specific Rules

```txt [robots.txt]
User-agent: *
Allow: /
Content-Usage: train-ai=n
Content-Usage: /docs/ train-ai=y
Content-Usage: /api/ train-ai=n
```
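
One way to reason about path-specific rules is to resolve the effective preference for a given URL. The sketch below assumes the longest matching path prefix wins, as it does for `Allow` and `Disallow`; that precedence is an assumption here, and the `UsageRule` type and `resolveUsage` helper are hypothetical.

```ts
interface UsageRule {
  path: string // '/' stands for the global rule
  prefs: Record<string, 'y' | 'n'>
}

// The rules from the example above, written out as data.
const rules: UsageRule[] = [
  { path: '/', prefs: { 'train-ai': 'n' } },
  { path: '/docs/', prefs: { 'train-ai': 'y' } },
  { path: '/api/', prefs: { 'train-ai': 'n' } },
]

// Assumption: the longest matching path prefix that mentions the category wins.
function resolveUsage(urlPath: string, category: string): 'y' | 'n' | undefined {
  const matches = rules
    .filter(rule => urlPath.startsWith(rule.path) && category in rule.prefs)
    .sort((a, b) => b.path.length - a.path.length)
  return matches[0]?.prefs[category]
}

resolveUsage('/docs/getting-started', 'train-ai') // 'y'
resolveUsage('/blog/post', 'train-ai') // 'n'
```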

### Programmatic Configuration

**Object Format (Recommended)** - Type-safe with autocomplete:

```ts [nuxt.config.ts]
export default defineNuxtConfig({
  robots: {
    groups: [
      {
        userAgent: '*',
        allow: '/',
        contentUsage: {
          'train-ai': 'n'
        }
      }
    ]
  }
})
```

## Content-Signal (Cloudflare/IETF aipref-contentsignals)

Content-Signal is [Cloudflare's implementation](https://blog.cloudflare.com/content-signals-policy/) based on [IETF aipref-contentsignals](https://www.ietf.org/archive/id/draft-romm-aipref-contentsignals-00.html).

### Categories

<table>
<thead>
  <tr>
    <th>
      Category
    </th>
    
    <th>
      Description
    </th>
    
    <th>
      Example Use Case
    </th>
  </tr>
</thead>

<tbody>
  <tr>
    <td>
      <code>
        search
      </code>
    </td>
    
    <td>
      Search Applications
    </td>
    
    <td>
      Indexing for search results and snippets
    </td>
  </tr>
  
  <tr>
    <td>
      <code>
        ai-input
      </code>
    </td>
    
    <td>
      AI Input
    </td>
    
    <td>
      RAG, grounding, generative AI search answers
    </td>
  </tr>
  
  <tr>
    <td>
      <code>
        ai-train
      </code>
    </td>
    
    <td>
      AI Training
    </td>
    
    <td>
      Training or fine-tuning AI models
    </td>
  </tr>
</tbody>
</table>

### Values

- `yes` - Allow this category of use
- `no` - Disallow this category of use

### Syntax

```txt [robots.txt]
User-agent: *
Content-Signal: <category>=<value>[, <category>=<value>]
Content-Signal: /path/ <category>=<value>[, <category>=<value>]
```
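
If you need to read these values back, for example when auditing an existing robots.txt, a parser for the value part of the line might look like the sketch below. The `parseContentSignal` helper is hypothetical, and skipping unknown categories or values (rather than rejecting the line) is a design assumption.

```ts
type SignalCategory = 'search' | 'ai-input' | 'ai-train'
type SignalValue = 'yes' | 'no'

const SIGNAL_CATEGORIES: SignalCategory[] = ['search', 'ai-input', 'ai-train']

// Parses the value part of a Content-Signal line, e.g. "ai-train=no, search=yes".
// Unknown categories or values are skipped rather than treated as errors.
function parseContentSignal(value: string): Partial<Record<SignalCategory, SignalValue>> {
  const result: Partial<Record<SignalCategory, SignalValue>> = {}
  for (const pair of value.split(',')) {
    const [rawKey, rawValue] = pair.split('=').map(part => part.trim())
    const category = SIGNAL_CATEGORIES.find(c => c === rawKey)
    if (category && (rawValue === 'yes' || rawValue === 'no')) {
      result[category] = rawValue
    }
  }
  return result
}

parseContentSignal('ai-train=no, search=yes')
// { 'ai-train': 'no', search: 'yes' }
```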

### Examples

#### Block AI Training, Allow Search

```txt [robots.txt]
User-agent: *
Allow: /
Content-Signal: ai-train=no, search=yes
```

#### Block All AI Usage

```txt [robots.txt]
User-agent: *
Allow: /
Content-Signal: ai-train=no, ai-input=no, search=yes
```

#### Path-Specific Rules

```txt [robots.txt]
User-agent: *
Allow: /
Content-Signal: ai-train=no, search=yes
Content-Signal: /docs/ ai-input=yes
Content-Signal: /api/ ai-train=no, ai-input=no, search=no
```

### Programmatic Configuration

**Object Format (Recommended)** - Type-safe with autocomplete:

```ts [nuxt.config.ts]
export default defineNuxtConfig({
  robots: {
    groups: [
      {
        userAgent: '*',
        allow: '/',
        contentSignal: {
          'ai-train': 'no',
          'search': 'yes'
        }
      }
    ]
  }
})
```

## Vendor-Specific AI Tokens

While `Content-Usage` and `Content-Signal` are emerging standards, some major AI providers offer specific User-Agent tokens that let you opt out of AI training while maintaining search visibility.

Because each vendor documents its own token and commits to honoring it, these tokens are currently a more dependable signal for that vendor than the generic directives.

### Google-Extended

[Google-Extended](https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers#google-extended) is a standalone token that allows you to control whether your content is used to help improve Google's generative AI APIs and services (Gemini, Vertex AI).

- **Does NOT** affect your site's ranking in Google Search.
- **Does NOT** stop Googlebot from crawling your site for indexing.

```txt [robots.txt]
User-agent: Google-Extended
Disallow: /
```

### Applebot-Extended

[Applebot-Extended](https://support.apple.com/en-us/119829) allows you to opt out of having your website content used to train Apple's foundation models that power generative AI features across Apple products.

- **Does NOT** affect your site's ranking in Apple Search (Spotlight, Siri).
- **Does NOT** stop Applebot from crawling your site.

```txt [robots.txt]
User-agent: Applebot-Extended
Disallow: /
```

### Dataset Crawlers

Some crawlers are specifically designed to build massive datasets used for training many different AI models. Blocking these can be a broad-stroke approach to AI protection.

- **CCBot (Common Crawl)**: Used by many open-source and commercial models (including early GPT versions).
- **Bytespider**: Crawler for ByteDance (TikTok) AI models.
- **Diffbot**: AI-powered knowledge extraction.

```txt [robots.txt]
User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Diffbot
Disallow: /
```
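
If you manage rules in `nuxt.config.ts`, the same blocks can be generated from a list of tokens. This is a minimal sketch; the `disallow` key is assumed to mirror the `allow` key used in the other examples on this page, and the variable names are illustrative.

```ts
// Crawler tokens from the list above; extend as needed.
const datasetCrawlers = ['CCBot', 'Bytespider', 'Diffbot']

// One group per crawler, in the same shape as the `groups` entries used on this page.
const blockedGroups = datasetCrawlers.map(userAgent => ({
  userAgent,
  disallow: '/',
}))
```

The resulting array could then be spread into `robots.groups` alongside your other rules.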

## Using Both Together

You can use both Content-Usage and Content-Signal in the same robots.txt for comprehensive coverage:

```txt [robots.txt]
User-agent: *
Allow: /
Content-Usage: bots=y, train-ai=n
Content-Signal: ai-train=no, search=yes
```

<code-group>

```ts [Object Format (Recommended)]
export default defineNuxtConfig({
  robots: {
    groups: [
      {
        userAgent: '*',
        allow: '/',
        contentUsage: {
          'bots': 'y',
          'train-ai': 'n'
        },
        contentSignal: {
          'ai-train': 'no',
          'search': 'yes'
        }
      }
    ]
  }
})
```

```ts [String Format]
export default defineNuxtConfig({
  robots: {
    groups: [
      {
        userAgent: '*',
        allow: '/',
        contentUsage: ['bots=y, train-ai=n'],
        contentSignal: ['ai-train=no, search=yes']
      }
    ]
  }
})
```

</code-group>
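
The two directives use different category names (`train-ai` vs `ai-train`) and different values (`y`/`n` vs `yes`/`no`), so they are easy to let drift apart. A hedged sketch of deriving both lines from a single preference follows; the `trainingDirectives` helper is hypothetical.

```ts
// Hypothetical helper: derive both directive lines from one training preference.
function trainingDirectives(allowTraining: boolean): string[] {
  return [
    `Content-Usage: train-ai=${allowTraining ? 'y' : 'n'}`,
    `Content-Signal: ai-train=${allowTraining ? 'yes' : 'no'}`,
  ]
}

trainingDirectives(false)
// ["Content-Usage: train-ai=n", "Content-Signal: ai-train=no"]
```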

## Examples

### Block All AI Training

<code-group>

```txt [Content-Usage]
User-agent: *
Allow: /
Content-Usage: train-ai=n
```

```txt [Content-Signal]
User-agent: *
Allow: /
Content-Signal: ai-train=no
```

</code-group>

### Documentation-Only Training

<code-group>

```txt [Content-Usage]
User-agent: *
Allow: /
Content-Usage: train-ai=n
Content-Usage: /docs/ train-ai=y
```

```txt [Content-Signal]
User-agent: *
Allow: /
Content-Signal: ai-train=no
Content-Signal: /docs/ ai-train=yes
```

</code-group>
