Nuxt AI Ready outputs clean markdown optimized for vectorization. This guide shows how to build a RAG (retrieval-augmented generation) pipeline from `llms-full.txt`.
`llms-full.txt` contains every page as markdown, separated by `---` dividers, with a frontmatter block per page:
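An illustrative sketch of the layout (the exact field names depend on your site; `title` and `route` here are assumptions — check your generated file):

```txt
---
title: Getting Started
route: /getting-started
---
# Getting Started

Install the module...
---
title: Configuration
route: /configuration
---
# Configuration

...
```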
```ts
const response = await fetch('https://yoursite.com/llms-full.txt')
const content = await response.text()

// Split on the `---` delimiters: after dropping the leading empty string,
// blocks alternate frontmatter, markdown, frontmatter, markdown, ...
const blocks = content.split(/^---$/m).filter(Boolean)

const pages: Record<string, string>[] = []
for (let i = 0; i < blocks.length; i += 2) {
  const frontmatter = blocks[i]
  const markdown = (blocks[i + 1] ?? '').trim()

  // Parse frontmatter lines into key/value pairs
  const meta: Record<string, string> = {}
  frontmatter.split('\n').forEach((line) => {
    const [key, ...val] = line.split(':')
    if (key?.trim())
      meta[key.trim()] = val.join(':').trim() // rejoin values that contain ':'
  })

  pages.push({ ...meta, markdown })
}
```
Use any embedding provider. Example with OpenAI:
```ts
import OpenAI from 'openai'

const openai = new OpenAI()

async function embed(text: string) {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  })
  return response.data[0].embedding
}

// Embed each page
const vectors = await Promise.all(
  pages.map(async page => ({
    id: page.route,
    embedding: await embed(page.markdown),
    metadata: { title: page.title, route: page.route },
  }))
)
```
Store the vectors wherever you like. Locally, sqlite-vec works well:

```ts
import Database from 'better-sqlite3'
import * as sqliteVec from 'sqlite-vec'

const db = new Database(':memory:')
sqliteVec.load(db)

// FLOAT[1536] matches the dimensionality of text-embedding-3-small
db.exec(`
  CREATE VIRTUAL TABLE pages USING vec0(
    id TEXT PRIMARY KEY,
    embedding FLOAT[1536]
  )
`)

const insert = db.prepare('INSERT INTO pages VALUES (?, ?)')
for (const v of vectors) {
  insert.run(v.id, new Float32Array(v.embedding))
}
```
Or use a hosted store such as Upstash Vector:

```ts
import { Index } from '@upstash/vector'

// Reads UPSTASH_VECTOR_REST_URL and UPSTASH_VECTOR_REST_TOKEN from the environment
const index = new Index()

await index.upsert(vectors.map(v => ({
  id: v.id,
  vector: v.embedding,
  metadata: v.metadata,
})))
```
To query, embed the search string and run a nearest-neighbour lookup (sqlite-vec shown here):

```ts
async function search(query: string, topK = 5) {
  const queryEmbedding = await embed(query)

  const results = db.prepare(`
    SELECT id, distance
    FROM pages
    WHERE embedding MATCH ?
    ORDER BY distance
    LIMIT ?
  `).all(new Float32Array(queryEmbedding), topK)

  return results
}
```
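If the vectors went to Upstash instead, the lookup side looks like this sketch — `index.query` with `vector`, `topK`, and `includeMetadata` is the Upstash SDK call, while the `searchUpstash` wrapper and the structural `VectorIndex` type are illustrative:

```typescript
// Minimal structural type covering the one Upstash Index method used here,
// so the sketch stays self-contained; in practice pass an `Index` from '@upstash/vector'.
interface VectorIndex {
  query: (opts: { vector: number[], topK: number, includeMetadata: boolean }) =>
    Promise<Array<{ id: string | number, score: number, metadata?: Record<string, string> }>>
}

// Upstash counterpart of `search` above: embed the query, then ask the index
// for the nearest neighbours. `embed` is injected so the function is easy to test.
async function searchUpstash(
  index: VectorIndex,
  embed: (text: string) => Promise<number[]>,
  query: string,
  topK = 5,
) {
  const vector = await embed(query)
  return index.query({ vector, topK, includeMetadata: true })
}
```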
```ts
// Use the top matches as context in a RAG prompt
const relevant = await search('how do I configure meta tags?')
const context = relevant
  .map(r => pages.find(p => p.route === r.id)?.markdown)
  .join('\n\n')
```
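The retrieved context then goes into the model prompt. `buildPrompt` below is a hypothetical helper (not part of Nuxt AI Ready) showing one common way to frame it:

```typescript
// Hypothetical helper: wrap retrieved docs and the user question into one prompt.
function buildPrompt(context: string, question: string): string {
  return [
    'Answer the question using only the documentation below.',
    'If the answer is not in the documentation, say so.',
    '',
    '<docs>',
    context,
    '</docs>',
    '',
    `Question: ${question}`,
  ].join('\n')
}

const prompt = buildPrompt('...retrieved markdown...', 'how do I configure meta tags?')
// Send `prompt` as the user message to whichever chat model you use.
```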
By default, each page is one chunk. For large pages, split by heading:
```ts
function chunkByHeading(markdown: string, route: string) {
  const sections = markdown.split(/^##\s+/m)
  return sections.map((section, i) => ({
    id: `${route}#${i}`,
    content: section.trim(),
    route,
  }))
}
```
| Strategy | When to use |
|---|---|
| Page-level | Small pages (<2k tokens), general search |
| Heading-level | Long docs, precise retrieval needed |
| Sliding window | Dense technical content, overlap matters |
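The sliding-window strategy from the table can be sketched as below — character-based sizes for simplicity (a real pipeline would count tokens), and both `size` and `overlap` defaults are arbitrary assumptions:

```typescript
// Sliding-window chunking: fixed-size windows that overlap, so content near
// a boundary is retrievable from either neighbouring chunk.
// Assumes overlap < size; sizes are in characters, not tokens.
function chunkSlidingWindow(markdown: string, route: string, size = 1500, overlap = 200) {
  const chunks: { id: string, content: string, route: string }[] = []
  let start = 0
  let i = 0
  while (start < markdown.length) {
    chunks.push({
      id: `${route}#${i}`,
      content: markdown.slice(start, start + size),
      route,
    })
    if (start + size >= markdown.length)
      break
    start += size - overlap
    i++
  }
  return chunks
}
```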
Run vectorization at build time:
```ts
// scripts/vectorize.ts
import { readFileSync } from 'node:fs'

const llmsFull = readFileSync('.output/public/llms-full.txt', 'utf-8')
// ... parse and vectorize as above
```
Add to your build:
```json
{
  "scripts": {
    "generate": "nuxt generate && tsx scripts/vectorize.ts"
  }
}
```