Protecting Nuxt Apps from Malicious Crawlers

Robots.txt is a polite suggestion. Malicious crawlers ignore it. Here's how to actually protect your Nuxt app.
Harlan Wilton · 8 min read
What you'll learn
  • Robots.txt is a suggestion, not security: use server middleware and route rules for actual protection
  • Block non-production environments with X-Robots-Tag: noindex header via middleware
  • Rate limiting and security headers belong in routeRules, not robots.txt

Robots.txt and meta robots tags are polite suggestions. Malicious crawlers ignore them.

You need actual security: block non-production environments, protect development assets, rate limit aggressive crawlers, authenticate sensitive routes, use HTTPS everywhere. Don't rely on robots.txt for sensitive data, IP blocking alone (easily bypassed), or user-agent detection (trivial to fake).

Quick Setup

The easiest way to handle crawler blocking in Nuxt is with the Robots module:

export default defineNuxtConfig({
  modules: ['@nuxtjs/robots'],
  robots: {
    // Automatically blocks non-production environments
    disallow: process.env.NODE_ENV !== 'production' ? ['*'] : [],
    // Block specific paths
    groups: [
      { userAgent: '*', disallow: ['/admin', '/dashboard'] }
    ]
  }
})
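With the config above, the module serves a generated `/robots.txt` roughly like this (illustrative; the exact output depends on module version and environment):

```
User-agent: *
Disallow: /admin
Disallow: /dashboard
```

In non-production environments it disallows everything, and the module also sends a noindex `X-Robots-Tag` header, so crawlers that skip robots.txt still see the no-index signal.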

For additional security beyond crawler blocking, use server middleware and route rules:

server/middleware/security.ts
export default defineEventHandler((event) => {
  // Block non-production environments
  if (process.env.NODE_ENV !== 'production') {
    setHeader(event, 'X-Robots-Tag', 'noindex, nofollow')
  }

  // Enforce HTTPS
  const proto = getRequestHeader(event, 'x-forwarded-proto')
  if (proto === 'http') {
    return sendRedirect(event, `https://${getRequestHost(event)}${event.path}`, 301)
  }
})
nuxt.config.ts
export default defineNuxtConfig({
  nitro: {
    routeRules: {
      '/**': {
        headers: {
          'X-Frame-Options': 'DENY',
          'X-Content-Type-Options': 'nosniff',
          'Referrer-Policy': 'strict-origin-when-cross-origin'
        }
      }
    }
  }
})

For rate limiting, use a dedicated module like nuxt-rate-limit or nuxt-api-shield:

nuxt.config.ts
export default defineNuxtConfig({
  modules: ['nuxt-rate-limit'],
  nuxtRateLimit: {
    routes: {
      '/api/*': { maxRequests: 100, intervalSeconds: 60 }
    }
  }
})

Or implement manually:

server/middleware/rate-limit.ts
const requestCounts = new Map<string, number>()

export default defineEventHandler((event) => {
  // getRequestIP can return undefined behind some proxies
  const ip = getRequestIP(event) ?? 'unknown'
  const count = requestCounts.get(ip) ?? 0

  // Note: this naive counter never resets; see the windowed
  // version under Rate Limiting below
  if (count > 100) {
    throw createError({
      statusCode: 429,
      message: 'Too Many Requests'
    })
  }

  requestCounts.set(ip, count + 1)
})

Environment Protection

Development & Staging

The Robots module automatically handles non-production blocking. It detects preview/staging environments and adds noindex headers:

nuxt.config.ts
export default defineNuxtConfig({
  modules: ['@nuxtjs/robots'],
  robots: {
    // Automatically detects non-production environments
    // Or manually control:
    blockNonSeoBots: true,
    disallow: process.env.VERCEL_ENV === 'preview' ? ['*'] : []
  }
})

For custom logic or basic auth, use middleware:

server/middleware/block-non-production.ts
export default defineEventHandler((event) => {
  const isProd = process.env.NODE_ENV === 'production'
  const isMainDomain = getRequestHost(event) === 'mysite.com'

  if (!isProd || !isMainDomain) {
    setHeader(event, 'X-Robots-Tag', 'noindex, nofollow')

    // Optional: basic auth for staging (check the credentials, not just the header)
    // STAGING_USER / STAGING_PASS are example env var names
    const auth = getRequestHeader(event, 'authorization')
    const expected = `Basic ${btoa(`${process.env.STAGING_USER}:${process.env.STAGING_PASS}`)}`
    if (auth !== expected) {
      setResponseStatus(event, 401)
      setHeader(event, 'WWW-Authenticate', 'Basic realm="Staging"')
      return 'Authentication required'
    }
  }
})

Sensitive Routes

Use the Robots module to block indexing of sensitive paths:

nuxt.config.ts
export default defineNuxtConfig({
  modules: ['@nuxtjs/robots'],
  robots: {
    disallow: ['/admin', '/dashboard', '/user']
  }
})

For authentication and custom protection logic, add middleware:

server/middleware/protect-routes.ts
export default defineEventHandler((event) => {
  const protectedPaths = ['/admin', '/dashboard', '/user']

  if (protectedPaths.some(path => event.path.startsWith(path))) {
    // Ensure user is authenticated
    if (!event.context.auth?.user) {
      return sendRedirect(event, '/login')
    }

    // Block indexing of protected content
    setHeader(event, 'X-Robots-Tag', 'noindex, nofollow')
  }
})

Or use route rules for static protection:

nuxt.config.ts
export default defineNuxtConfig({
  nitro: {
    routeRules: {
      '/admin/**': {
        headers: {
          'X-Robots-Tag': 'noindex, nofollow'
        }
      }
    }
  }
})

Crawler Identification

Good vs Bad Crawlers

Identify legitimate crawlers through:

  • Reverse DNS lookup
  • IP verification
  • Behavior patterns
  • Request rate
server/utils/verify-crawler.ts
import { lookup, reverse } from 'node:dns/promises'

export async function isLegitCrawler(ip: string, userAgent: string) {
  // Example: verify Googlebot via reverse DNS plus a forward-confirm lookup
  if (userAgent.includes('Googlebot')) {
    try {
      // The IP must reverse-resolve to a Google-owned hostname...
      const [hostname] = await reverse(ip)
      if (!/\.(?:googlebot|google)\.com$/.test(hostname))
        return false
      // ...and that hostname must resolve back to the same IP
      // (prevents spoofed PTR records)
      const { address } = await lookup(hostname)
      return address === ip
    }
    catch { return false }
  }
  return false
}
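One way to wire the verifier into a middleware (a sketch, assuming the `isLegitCrawler` util above is auto-imported from `server/utils` as Nitro does by default; `getRequestIP` can come back empty behind some proxies):

```typescript
// server/middleware/verify-googlebot.ts (sketch)
export default defineEventHandler(async (event) => {
  const ua = getRequestHeader(event, 'user-agent') ?? ''
  const ip = getRequestIP(event, { xForwardedFor: true }) ?? ''

  // Only challenge requests that claim to be Googlebot
  if (ua.includes('Googlebot') && ip) {
    const legit = await isLegitCrawler(ip, ua)
    if (!legit) {
      // Fake Googlebots get a 403 instead of content
      throw createError({ statusCode: 403, message: 'Forbidden' })
    }
  }
})
```

DNS lookups add latency, so in practice you'd cache verification results per IP.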

Rate Limiting

Use nuxt-security for built-in rate limiting:

nuxt.config.ts
export default defineNuxtConfig({
  modules: ['nuxt-security'],
  security: {
    rateLimiter: {
      tokensPerInterval: 100,
      interval: 60000, // 1 minute
      headers: true
    }
  }
})

Or use nuxt-rate-limit for simpler API-focused rate limiting:

nuxt.config.ts
export default defineNuxtConfig({
  modules: ['nuxt-rate-limit'],
  nuxtRateLimit: {
    routes: {
      '/api/*': { maxRequests: 100, intervalSeconds: 60 },
      '/api/auth/*': { maxRequests: 10, intervalSeconds: 60 }
    }
  }
})

For custom tiered logic, implement manually:

server/middleware/rate-limit.ts
const requestCounts = new Map<string, { count: number, resetAt: number }>()

export default defineEventHandler((event) => {
  const ip = getRequestIP(event) ?? 'unknown'
  const now = Date.now()
  const windowMs = 15 * 60 * 1000 // 15 minutes

  const record = requestCounts.get(ip)

  if (!record || now > record.resetAt) {
    requestCounts.set(ip, { count: 1, resetAt: now + windowMs })
    return
  }

  // Different limits for different paths
  const maxRequests = event.path.startsWith('/api') ? 100 : 1000

  if (record.count > maxRequests) {
    throw createError({
      statusCode: 429,
      message: 'Too Many Requests'
    })
  }

  record.count++
})
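One caveat with the in-memory `Map`: entries for past windows are never deleted, so it grows for every IP that ever visits, and it's per-instance (multi-instance deployments would want a shared store like Redis instead). A minimal sketch of an eviction pass you could run on an interval (names are illustrative):

```typescript
// Evict expired rate-limit records so the Map doesn't grow unbounded
interface RateRecord { count: number, resetAt: number }

export function pruneExpired(
  store: Map<string, RateRecord>,
  now: number
): number {
  let removed = 0
  for (const [key, record] of store) {
    if (now > record.resetAt) {
      store.delete(key)
      removed++
    }
  }
  return removed
}

// e.g. in a Nitro plugin:
// setInterval(() => pruneExpired(requestCounts, Date.now()), 60_000)
```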

Infrastructure Security

HTTPS Enforcement

Nuxt handles HTTPS redirects via middleware:

server/middleware/https.ts
export default defineEventHandler((event) => {
  const proto = getRequestHeader(event, 'x-forwarded-proto')

  if (proto === 'http') {
    return sendRedirect(
      event,
      `https://${getRequestHost(event)}${event.path}`,
      301
    )
  }
})
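The redirect only kicks in after a first plaintext request; a `Strict-Transport-Security` header tells browsers to skip HTTP entirely on later visits. A sketch via route rules (the max-age value is a typical choice, not a requirement; nuxt-security sets a similar header out of the box):

```typescript
export default defineNuxtConfig({
  nitro: {
    routeRules: {
      '/**': {
        headers: {
          // Browsers remember to use HTTPS for the next year
          'Strict-Transport-Security': 'max-age=31536000; includeSubDomains'
        }
      }
    }
  }
})
```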

Security Headers

Use nuxt-security to automatically configure security headers following OWASP best practices:

nuxt.config.ts
export default defineNuxtConfig({
  modules: ['nuxt-security'],
  security: {
    headers: {
      crossOriginResourcePolicy: 'same-origin',
      crossOriginOpenerPolicy: 'same-origin',
      xContentTypeOptions: 'nosniff',
      referrerPolicy: 'strict-origin-when-cross-origin',
      contentSecurityPolicy: {
        'default-src': ['\'self\''],
        'script-src': ['\'self\'', '\'unsafe-inline\'']
      }
    }
  }
})

Or configure manually via route rules:

nuxt.config.ts
export default defineNuxtConfig({
  nitro: {
    routeRules: {
      '/**': {
        headers: {
          'X-Frame-Options': 'DENY',
          'X-Content-Type-Options': 'nosniff',
          'Referrer-Policy': 'strict-origin-when-cross-origin',
          ...(process.env.NODE_ENV === 'production' && {
            'Content-Security-Policy': 'default-src \'self\';'
          })
        }
      }
    }
  }
})

Monitoring & Detection

Logging Suspicious Activity

server/middleware/crawler-monitor.ts
export default defineEventHandler((event) => {
  const ua = getRequestHeader(event, 'user-agent') ?? ''
  const ip = getRequestIP(event) ?? 'unknown'

  // Naive heuristic: missing UA, or common scraping-tool signatures
  const suspicious = !ua || /curl|wget|python-requests|scrapy/i.test(ua)
  if (suspicious) {
    console.warn(`Suspicious crawler: ${ip} with UA: ${ua}`)
    // Consider blocking or rate limiting here
  }
})

Using Web Application Firewalls

Services like Cloudflare or AWS WAF can:

  • Block malicious IPs
  • Prevent DDoS attacks
  • Filter suspicious requests
  • Monitor traffic patterns

Opinion: If you're running a small blog, a WAF is overkill. Add it when you're actually getting attacked.
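One thing to watch when a WAF or CDN sits in front of your app: the socket IP your server sees is the proxy's, so IP-based rate limiting needs the header the proxy sets. A hedged sketch (`cf-connecting-ip` is Cloudflare-specific; only trust these headers when a proxy you control sets them):

```typescript
// Resolve the real client IP from proxy-set headers,
// falling back to the first x-forwarded-for entry
export function clientIpFromHeaders(
  headers: Record<string, string | undefined>
): string {
  return (
    headers['cf-connecting-ip']
    ?? headers['x-forwarded-for']?.split(',')[0]?.trim()
    ?? 'unknown'
  )
}
```

In a middleware you'd call it as `clientIpFromHeaders(getRequestHeaders(event))`.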

Common Attacks

Content Scraping

Use nuxt-security rate limiting to prevent automated scraping:

nuxt.config.ts
export default defineNuxtConfig({
  modules: ['nuxt-security'],
  security: {
    rateLimiter: {
      tokensPerInterval: 50,
      interval: 60000
    }
  }
})

For more control over bot detection and delays:

server/middleware/anti-scraping.ts
const requestCounts = new Map<string, number>()

export default defineEventHandler(async (event) => {
  const ip = getRequestIP(event) ?? 'unknown'
  const count = requestCounts.get(ip) ?? 0

  if (count > 100) {
    throw createError({
      statusCode: 429,
      message: 'Too Many Requests'
    })
  }

  requestCounts.set(ip, count + 1)

  // Add a slight delay for obvious automation (naive UA check, easily faked)
  const ua = getRequestHeader(event, 'user-agent') ?? ''
  if (/bot|crawler|spider|scrape/i.test(ua)) {
    await new Promise(resolve => setTimeout(resolve, 500))
  }
})

Form Spam

Use nuxt-security for XSS validation and request limiting:

nuxt.config.ts
export default defineNuxtConfig({
  modules: ['nuxt-security'],
  security: {
    xssValidator: true,
    rateLimiter: {
      tokensPerInterval: 5,
      interval: 60000
    }
  }
})

For honeypot fields and custom validation:

server/api/contact.post.ts
const submissionCounts = new Map<string, number>()

export default defineEventHandler(async (event) => {
  const body = await readBody(event)
  const ip = getRequestIP(event) ?? 'unknown'

  // Honeypot check
  if (body.website) { // hidden field
    return { success: false }
  }

  // Rate limiting
  const count = submissionCounts.get(ip) || 0
  if (count > 5) {
    throw createError({
      statusCode: 429,
      message: 'Too many attempts'
    })
  }

  submissionCounts.set(ip, count + 1)

  // Process legitimate submission
  // ...
})