
Add llms.txt for machine-readable docs index #313

Open
dariye wants to merge 4 commits into activeagents:main from dariye:add-llms-txt

Conversation


@dariye dariye commented Feb 12, 2026

Summary

  • Adds /llms.txt — a machine-readable index of all 34 documentation pages following the llms.txt spec
  • Generated at build time via VitePress buildEnd hook — npm run docs:build produces HTML + llms.txt in one command
  • Adds docs/llms_txt.md — docs page explaining llms.txt, linked in the Contributing sidebar section
  • Adds CI validation step after build to verify llms.txt content (entry count, sections, format)
  • Adds <link rel="help"> tag in HTML head pointing to /llms.txt

How it works

The generation logic lives in docs/.vitepress/llms-txt.ts, exported as generateLlmsTxt and wired into config.mts as buildEnd: generateLlmsTxt. It reads frontmatter title: and description: from each .md file using regex (no npm dependencies added). On every docs deploy, npm run docs:build generates llms.txt directly into the dist output directory.
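
For orientation, here is a condensed sketch of the hook (simplified from the real docs/.vitepress/llms-txt.ts; the actual file carries the full curated page list and stricter frontmatter parsing):

```ts
// Condensed sketch of docs/.vitepress/llms-txt.ts (not the exact implementation)
import { readFileSync, writeFileSync } from 'node:fs'
import { join } from 'node:path'
import type { SiteConfig } from 'vitepress'

const BASE_URL = 'https://docs.activeagents.ai'

// Curated sections; each maps to one "## <title>" block in llms.txt
const sections = [
  { title: 'Getting Started', pages: [{ path: 'getting_started' }] },
  // ... Framework, Agents, Actions, Providers, Examples, Contributing
]

export async function generateLlmsTxt(siteConfig: SiteConfig) {
  const lines = ['# Active Agent', '', '> ActiveAgent extends Rails MVC to AI interactions. ...']

  for (const section of sections) {
    lines.push('', `## ${section.title}`, '')
    for (const page of section.pages) {
      const src = readFileSync(join(siteConfig.srcDir, `${page.path}.md`), 'utf-8')
      // Frontmatter is read with plain regex, so no npm dependency is added
      const title = src.match(/^title:\s*(.+)$/m)?.[1]?.trim() || page.path
      const desc = src.match(/^description:\s*(.+)$/m)?.[1]?.trim() || ''
      lines.push(`- [${title}](${BASE_URL}/${page.path}): ${desc}`)
    }
  }

  writeFileSync(join(siteConfig.outDir, 'llms.txt'), lines.join('\n'))
}
```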

  • No separate script or npm command — VitePress build does everything
  • No committed artifact — llms.txt is generated, not checked in
  • CI validates output — a post-build step checks file existence, H1, entry count (>=30), and all 7 sections

Preview of generated llms.txt
# Active Agent

> ActiveAgent extends Rails MVC to AI interactions. Build intelligent agents using familiar patterns — controllers, actions, callbacks, and views. The AI framework for Rails with less code & more fun.

## Getting Started

- [Getting Started](https://docs.activeagents.ai/getting_started): Build AI agents with Rails in minutes. Learn how to install, configure, and create your first agent.

## Framework

- [Active Agent](https://docs.activeagents.ai/framework): ActiveAgent extends Rails MVC to AI interactions. Build intelligent agents using familiar patterns—controllers, actions, callbacks, and views.
- [Agents](https://docs.activeagents.ai/agents): Controllers for AI interactions with actions, callbacks, views, and concerns that generate AI responses instead of rendering HTML.
- [Providers](https://docs.activeagents.ai/providers): Connect your agents to AI services through a unified interface. Switch between OpenAI, Anthropic, local models, or testing mocks without changing agent code.
- [Configuration](https://docs.activeagents.ai/framework/configuration): Flexible configuration for framework-level settings and provider-specific options. Configure retry strategies, logging, and multiple AI providers with environment-specific settings.
- [Instrumentation and Logging](https://docs.activeagents.ai/framework/instrumentation): Monitor provider operations using ActiveSupport::Notifications. Track performance metrics, debug generation flows, and integrate with external monitoring services.
- [Retries](https://docs.activeagents.ai/framework/retries): Automatic retry mechanisms for handling rate limits, timeouts, and transient errors using provider-native SDK retry strategies with exponential backoff.
- [Rails Integration](https://docs.activeagents.ai/framework/rails): Install ActiveAgent in Rails applications with generators for agents, actions, and views. Configure providers and leverage familiar Rails conventions.
- [Testing ActiveAgent Applications](https://docs.activeagents.ai/framework/testing): Testing strategies for ActiveAgent applications with credential management, VCR integration, and test patterns.

## Agents

- [Actions](https://docs.activeagents.ai/actions): Public methods in your agent that define specific AI behaviors using prompt() for text generation or embed() for vector embeddings.
- [Generation](https://docs.activeagents.ai/agents/generation): Execute AI generations synchronously with prompt_now or asynchronously with prompt_later using ActiveAgent's generation methods.
- [Agent Instructions](https://docs.activeagents.ai/agents/instructions): System-level messages that guide agent behavior, personality, capabilities, and tool usage. The agent's operating manual for every interaction.
- [Streaming](https://docs.activeagents.ai/agents/streaming): Stream responses from AI providers in real-time using callbacks that execute at different points in the streaming lifecycle.
- [Callbacks](https://docs.activeagents.ai/agents/callbacks): Control agent lifecycle with generation, prompting, embedding, and streaming callbacks for setup, validation, cleanup, and real-time response handling.
- [Error Handling](https://docs.activeagents.ai/agents/error_handling): Build resilient agents with automatic retries for network failures and application-level rescue handlers for custom error recovery.

## Actions

- [Messages](https://docs.activeagents.ai/actions/messages): Build conversation context with messages containing roles (user, assistant, system, tool) and content (text, images, documents) in native or unified format.
- [Embeddings](https://docs.activeagents.ai/actions/embeddings): Generate vector embeddings from text to enable semantic search, clustering, and similarity comparison in your AI applications.
- [Tools](https://docs.activeagents.ai/actions/tools): Extend agents with callable functions that LLMs can trigger during generation. Unified interface across providers for function calling.
- [Model Context Protocols (MCP)](https://docs.activeagents.ai/actions/mcps): Connect agents to external services and APIs using the Model Context Protocol. Universal integration for tools and data sources.
- [Structured Output](https://docs.activeagents.ai/actions/structured_output): Control JSON responses from AI models with json_object for simple output or json_schema for validated structured data.
- [Usage Statistics](https://docs.activeagents.ai/actions/usage): Track token usage and performance metrics across all AI providers with normalized usage objects.

## Providers

- [Anthropic Provider](https://docs.activeagents.ai/providers/anthropic): Integration with Claude models including Sonnet 4.5, Haiku 4.5, and Opus 4.1. Advanced reasoning, extended context windows, thinking mode, and strong performance on complex tasks.
- [Ollama Provider](https://docs.activeagents.ai/providers/ollama): Local LLM inference using Ollama platform. Run Llama 3, Mistral, and Gemma locally without external APIs. Perfect for privacy-sensitive applications and development.
- [OpenAI Provider](https://docs.activeagents.ai/providers/open_ai): Integration with GPT models including GPT-5, GPT-4.1, GPT-4o, and o3. Responses API with built-in tools or traditional Chat Completions API for standard interactions.
- [OpenRouter Provider](https://docs.activeagents.ai/providers/open_router): Access 200+ AI models from multiple providers through unified API. Intelligent routing, automatic fallbacks, multimodal support, PDF processing, and cost optimization.
- [Mock Provider](https://docs.activeagents.ai/providers/mock): Testing provider for developing and testing agents without API calls or costs. Returns predictable pig latin responses and generates random embeddings.

## Examples

- [Browser Use Agent](https://docs.activeagents.ai/examples/browser-use-agent): Browser automation with AI-driven control. Navigate web pages, interact with elements, extract content, and take screenshots using Cuprite/Chrome.
- [Data Extraction](https://docs.activeagents.ai/examples/data_extraction_agent): Extract structured data from PDF resumes using AI-powered parsing. Demonstrates multimodal input and structured output with JSON schemas.
- [MCP Integration Agent](https://docs.activeagents.ai/examples/mcp-integration-agent): Connect ActiveAgent with external services through Model Context Protocol. Demonstrates standardized integration with cloud storage, APIs, and custom services.
- [Research Agent](https://docs.activeagents.ai/examples/research-agent): Combine multiple tools and data sources for comprehensive research tasks. Integrates web search, MCP servers, and image generation for powerful research workflows.
- [Support Agent](https://docs.activeagents.ai/examples/support-agent): Customer support chatbot demonstrating core ActiveAgent concepts including tool calling, message context, and multimodal responses.
- [Translation Agent](https://docs.activeagents.ai/examples/translation-agent): Create specialized agents for language translation tasks. Demonstrates how to build focused, single-purpose agents with clear responsibilities.
- [Web Search Agent](https://docs.activeagents.ai/examples/web-search-agent): Web search capabilities through OpenAI's search models and tools. Access real-time web information using Chat Completions API or Responses API.

## Contributing

- [Documentation](https://docs.activeagents.ai/contributing/documentation): Deterministic, always-accurate documentation where every code example comes from tested files. Learn how to maintain documentation that can't drift from code.

Test plan

  • npm run docs:build — completes and logs llms.txt generation with 34 entries
  • docs/.vitepress/dist/llms.txt has correct content, no self-referential entry
  • CI validation step — 34 entries, all 7 sections present, H1 correct
  • Verify https://docs.activeagents.ai/llms.txt serves correctly after deploy
  • Verify /llms_txt page renders in VitePress

🤖 Generated with Claude Code

Adds an llms.txt file following the llms.txt spec, providing AI tools
with a structured index of all 35 documentation pages. Includes a Node
generator script, test suite, docs page, CI integration, and sidebar link.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dariye dariye marked this pull request as draft February 12, 2026 12:21
dariye and others added 3 commits February 12, 2026 13:22
Move llms.txt generation from a standalone script into the VitePress
buildEnd hook so docs:build produces everything in one command.

- Extract generation logic to docs/.vitepress/llms-txt.ts
- Delete scripts/generate-llms-txt.mjs and docs/public/llms.txt
- Remove generate:llms-txt npm script and CI step
- Update test to build docs and check dist/llms.txt
- Update docs page with new regeneration instructions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The llms.txt page explains what llms.txt is — that's for humans, not
LLMs consuming the file. Omit it to avoid the circular reference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Delete scripts/test-llms-txt.mjs — a bespoke Node script with
hand-rolled assertions outside the project's test conventions
(Ruby Minitest via bin/test).

Add inline validation in docs.yml after docs:build instead.
This runs where it belongs: in the pipeline that produces the
artifact, checking file existence, H1, entry count, and sections.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dariye dariye marked this pull request as ready for review February 12, 2026 12:54
@TonsOfFun TonsOfFun requested a review from Copilot February 12, 2026 18:24

Copilot AI left a comment


Pull request overview

Adds support for publishing an llms.txt machine-readable documentation index as part of the VitePress docs build, documents the feature, and validates the generated output in CI.

Changes:

  • Add a new docs page describing llms.txt and how it’s generated/regenerated.
  • Generate llms.txt at VitePress build time via a buildEnd hook.
  • Add CI validation for the generated llms.txt and add an HTML <link rel="help"> reference.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| docs/llms_txt.md | Adds a documentation page explaining llms.txt and how to use/regenerate it. |
| docs/.vitepress/llms-txt.ts | Implements build-time generation of llms.txt from a curated list of docs pages. |
| docs/.vitepress/config.mts | Wires generation into buildEnd, adds <link rel="help">, and links the new docs page in the sidebar. |
| .github/workflows/docs.yml | Adds a post-build CI step to validate llms.txt existence/format/sections/entry count. |


['meta', { property: 'og:type', content: 'website' }],
['script', { async: '', defer: '', src: 'https://buttons.github.io/buttons.js' }]
['script', { async: '', defer: '', src: 'https://buttons.github.io/buttons.js' }],
['link', { rel: 'help', type: 'text/markdown', href: '/llms.txt', title: 'LLM Documentation' }]

Copilot AI Feb 12, 2026


The new <link rel="help"> uses an absolute href: '/llms.txt', which bypasses VitePress base when building versioned docs (or any non-root deployment). This can produce a broken link in the rendered HTML head for non-root bases; consider prefixing with the configured base (or using VitePress's withBase helper) so the link resolves correctly in all builds.
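
One way to apply that suggestion (a sketch only, assuming the base is known as a constant in config.mts; DOCS_BASE is a hypothetical env var, not something in this PR):

```ts
import { defineConfig } from 'vitepress'

// Hypothetical: versioned builds pass their base via an env var; defaults to root.
const base = process.env.DOCS_BASE ?? '/'

export default defineConfig({
  base,
  head: [
    // VitePress requires base to start and end with '/', so plain concatenation is safe
    ['link', { rel: 'help', type: 'text/markdown', href: `${base}llms.txt`, title: 'LLM Documentation' }],
  ],
})
```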

Comment on lines +113 to +117
const title = fm.title || page.path
const desc = fm.description || ''
const url = `${BASE_URL}/${page.path}`

lines.push(`- [${title}](${url}): ${desc}`)

Copilot AI Feb 12, 2026


llms.txt URLs are generated as ${BASE_URL}/${page.path} without considering the VitePress base (used for versioned builds). This means versioned builds will emit an llms.txt whose links point at the unversioned pages instead of the built site’s actual paths. Consider incorporating siteConfig.site.base into the generated URLs, or skipping generation when base is not / if only the root site should expose llms.txt.
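
A minimal sketch of that adjustment, assuming siteConfig.site.base is reachable from the buildEnd hook as the comment implies:

```ts
// Include the VitePress base in generated URLs ('' for the root site, '/v1' for a '/v1/' base)
const base = siteConfig.site.base.replace(/\/$/, '')
const url = `${BASE_URL}${base}/${page.path}`

lines.push(`- [${title}](${url}): ${desc}`)
```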

}

const outPath = join(siteConfig.outDir, 'llms.txt')
writeFileSync(outPath, lines.join('\n'))

Copilot AI Feb 12, 2026


writeFileSync(outPath, lines.join('\n')) writes the file without a trailing newline. Some tooling expects text files to end with a newline; consider ensuring a final \n at EOF when writing llms.txt.

Suggested change
writeFileSync(outPath, lines.join('\n'))
writeFileSync(outPath, lines.join('\n') + '\n')


Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.



Comment on lines +92 to +97
ENTRIES=$(grep -c "^- \[" "$FILE")
test "$ENTRIES" -ge 30 || { echo "FAIL: only $ENTRIES entries (expected >=30)"; exit 1; }
for section in "Getting Started" "Framework" "Agents" "Actions" "Providers" "Examples" "Contributing"; do
grep -q "^## $section" "$FILE" || { echo "FAIL: missing section '$section'"; exit 1; }
done
echo "llms.txt valid: $ENTRIES entries, all sections present"

Copilot AI Feb 13, 2026


CI validation only enforces ENTRIES >= 30, so a partial/incorrect llms.txt could still pass (especially since generation currently can skip missing pages). Since the generator’s page list is fixed, consider asserting the exact expected entry count (or otherwise validating every expected URL/title) to reliably catch regressions.

Suggested change
ENTRIES=$(grep -c "^- \[" "$FILE")
test "$ENTRIES" -ge 30 || { echo "FAIL: only $ENTRIES entries (expected >=30)"; exit 1; }
for section in "Getting Started" "Framework" "Agents" "Actions" "Providers" "Examples" "Contributing"; do
grep -q "^## $section" "$FILE" || { echo "FAIL: missing section '$section'"; exit 1; }
done
echo "llms.txt valid: $ENTRIES entries, all sections present"
EXPECTED_ENTRIES=30
ENTRIES=$(grep -c "^- \[" "$FILE")
test "$ENTRIES" -eq "$EXPECTED_ENTRIES" || { echo "FAIL: $ENTRIES entries found (expected exactly $EXPECTED_ENTRIES)"; exit 1; }
for section in "Getting Started" "Framework" "Agents" "Actions" "Providers" "Examples" "Contributing"; do
grep -q "^## $section" "$FILE" || { echo "FAIL: missing section '$section'"; exit 1; }
done
echo "llms.txt valid: $ENTRIES entries (expected $EXPECTED_ENTRIES), all sections present"

Comment on lines +7 to +11
Active Agent publishes an [`llms.txt`](/llms.txt) file — a machine-readable index of all documentation pages, following the [llms.txt specification](https://llmstxt.org).

## What is llms.txt?

The llms.txt spec provides a standard way for websites to offer documentation in a format optimized for large language models. Instead of crawling HTML pages, AI tools can fetch a single markdown file with structured links and descriptions for every page.

Copilot AI Feb 13, 2026


This page states llms.txt is an index of “all documentation pages”, but the generator currently uses a fixed allowlist and excludes some pages (e.g. docs/index.md, and intentionally excludes this page). Consider either updating the generator to truly cover all pages or clarifying here which pages are intentionally omitted.

Comment on lines +108 to +110
} catch {
console.warn(` skip: ${page.path}.md (not found)`)
continue

Copilot AI Feb 13, 2026


generateLlmsTxt silently skips missing pages (catch { console.warn(...); continue }). Since the list of pages is hard-coded, a missing/renamed doc file likely indicates a broken llms.txt and should fail the build/CI rather than producing a partial index that may still pass validation.

Suggested change
} catch {
console.warn(` skip: ${page.path}.md (not found)`)
continue
} catch (error) {
throw new Error(
`Failed to read or parse frontmatter for ${page.path}.md at ${filePath}: ${(error as Error).message}`,
)

Comment on lines +7 to +12
const sections = [
{
title: 'Getting Started',
pages: [{ path: 'getting_started' }],
},
{

Copilot AI Feb 13, 2026


The generator hard-codes the list of pages in sections, but the repo contains other top-level pages (e.g. docs/index.md) that are not included. If the goal is “index of all documentation pages”, consider generating the list from the VitePress page data / filesystem (and explicitly excluding only pages you don’t want, like llms_txt.md).
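
A sketch of that direction, assuming siteConfig.pages lists every markdown source path relative to srcDir (section grouping would still need to come from somewhere, e.g. frontmatter or directory names):

```ts
// Derive the page list from the build instead of a hard-coded allowlist.
// Assumption: siteConfig.pages holds paths like 'agents/streaming.md'.
const EXCLUDED = new Set(['index.md', 'llms_txt.md'])

const allPages = siteConfig.pages
  .filter((p) => !EXCLUDED.has(p))
  .map((p) => ({ path: p.replace(/\.md$/, '') }))
```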
