Replace flat sitemap llms.txt with curated, AI-optimized index #954
Deploying documentation
| Latest commit: | 85b8a9d |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://148a2bd3.documentation-21k.pages.dev |
| Branch Preview URL: | https://improve-llm-config.documentation-21k.pages.dev |
As a follow-up, the pages that are included in the LLMs, should be added a |
This is the new export: Few comments:
cc: @Lougarou |
I agree with this, but this would require a more substantial update of the documentation. Not against it – let's decide on which pages we want to include in the file, once approved, I can update them to include the |
User description
Replace flat sitemap llms.txt with curated, AI-optimized index
Summary
This PR replaces the auto-generated flat sitemap llms.txt with a curated, categorized index optimized for LLM consumption. Both versions are generated via @vuepress/plugin-llms; the difference is that we now supply custom template getters (getLlmsPluginOptions()) to filter, deduplicate, organize, and describe pages instead of dumping everything into a single flat list.
The exhaustive dump is preserved as llms-full.txt and cross-referenced from the new file's header. Per-page markdown (llms-page.txt) continues to generate as before.
Why this matters
When an LLM (ChatGPT, Claude, Copilot, etc.) ingests llms.txt, whether via an agent fetch, RAG pipeline, or IDE integration, the file's quality directly determines whether the model can find the right content, for the right version, without wasting context window tokens.
The old file made this nearly impossible. The new file makes it straightforward.
What changed
Implementation
A new getLlmsPluginOptions() function provides custom template getters for 15 sections of the curated llms.txt. The code is organized around four reusable factory helpers:
- createSlugOrderSection
- createPrefixSection
- createFilterSection
- createFilterSlugOrderSection
Additional helpers:
- matchIndexSlug: reusable predicate for matching index/overview pages (e.g., /quick-start/ or /quick-start/index)
- getPageDescription: extracts description from frontmatter or auto-excerpt
- normalizeIndexUrl: fixes broken /.md URLs to /index.md
Key design decisions:
- versioning.latest determines the server prefix; the Kubernetes operator version is resolved from versioning.versions. No hardcoded version strings.
- createFilterSection can pass a sortBy comparator for deterministic output regardless of VuePress page order.
- http-api paths, so this page appears only under APIs.
The plugin is configured with all three output modes:
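A minimal sketch of what a three-mode configuration could look like. The option names (llmsTxt, llmsFullTxt, llmsPageTxt, llmsTxtTemplateGetter) and the getter shape are assumptions to verify against the plugin documentation. The function name, the fullReference key, and the siteUrl parameter are hypothetical and are not taken from this PR.

```typescript
// Hypothetical, trimmed-down option shape; the real plugin types may differ.
interface LlmsOptionsSketch {
  llmsTxt: boolean; // curated index (llms.txt)
  llmsFullTxt: boolean; // exhaustive dump (llms-full.txt)
  llmsPageTxt: boolean; // per-page markdown
  // Assumed hook: template variable name -> function producing its text.
  llmsTxtTemplateGetter: Record<string, () => string>;
}

// Hypothetical stand-in for getLlmsPluginOptions(); siteUrl is an assumed parameter.
function getLlmsPluginOptionsSketch(siteUrl: string): LlmsOptionsSketch {
  return {
    llmsTxt: true,
    llmsFullTxt: true,
    llmsPageTxt: true,
    llmsTxtTemplateGetter: {
      // Header line that cross-references the exhaustive file.
      fullReference: () => `Full content: ${siteUrl}/llms-full.txt`,
    },
  };
}
```

In the real config, the returned object would be passed to llmsPlugin() in docs/.vuepress/config.ts, as the file walkthrough describes.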
Output comparison at a glance
(Comparison table of the old and new llms.txt; only fragments survive: "## Table of Contents)", "versioning.latest)", "/server/v26.0/...)", "https://docs.kurrent.io/...)", llms-full.txt cross-reference, sortBy)
Section order
Sections are ordered by importance for LLM comprehension (what the product is, then how to use it, then how to operate it):
Detailed comparison
Before: flat dump, no version awareness
Problems:
After: curated, categorized, described
How the generation works
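As a rough sketch of the flow this PR describes (filter, deduplicate, organize, describe), the generation step could look like the following. The page and section shapes, the function names, and the first-match-wins deduplication rule are all assumptions for illustration. The entry format follows the common llms.txt link-list convention.

```typescript
// Hypothetical page shape; the real VuePress page object carries more fields.
interface PageSketch {
  path: string;
  title: string;
  frontmatter: { description?: string };
  excerpt?: string;
}

// Hypothetical section definition: a heading plus a membership predicate.
interface SectionSketch {
  heading: string;
  matches: (page: PageSketch) => boolean;
}

// Description priority: frontmatter description, then auto-excerpt, then omitted.
function describePage(page: PageSketch): string | undefined {
  return page.frontmatter.description?.trim() || page.excerpt?.trim() || undefined;
}

function renderIndex(pages: PageSketch[], sections: SectionSketch[]): string {
  const claimed = new Set<string>(); // paths already listed, so each page appears once
  const blocks: string[] = [];
  for (const section of sections) {
    const lines: string[] = [];
    for (const page of pages) {
      if (claimed.has(page.path) || !section.matches(page)) continue;
      claimed.add(page.path);
      const link = `- [${page.title}](${page.path})`;
      const description = describePage(page);
      lines.push(description ? `${link}: ${description}` : link);
    }
    // Sections with no matching pages are skipped entirely.
    if (lines.length > 0) blocks.push([`## ${section.heading}`, ...lines].join("\n"));
  }
  return blocks.join("\n\n");
}
```

Because earlier sections claim pages first, section order doubles as the deduplication rule in this sketch: a page that matches two sections is listed only under the first.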
Description priority: frontmatter description → auto-excerpt → omitted. Descriptions improve automatically as authors add description to page frontmatter.
Scenario walkthrough
"How do I append events using the Python client?"
Python – Appending events with description
"How do I get started with Kurrent Cloud?"
"How do I deploy KurrentDB on Kubernetes?"
Limitations and known issues
1. Descriptions are auto-generated excerpts, not hand-written summaries
Most descriptions come from VuePress's auto-excerpt (~first 200 characters). This causes:
- Auto-Scavenge: Auto-Scavenge, Redaction: Redaction
- ...using thi...
- Backup and restore: Backup and restore Backing up...
- Kafka Sink, Elasticsearch Sink (no description)
Mitigation: The generator prioritizes frontmatter description. Adding it to any page immediately improves the output:
2. Community section is hardcoded
Community links are static strings rather than VuePress-derived data. Fine for rarely-changing URLs, but requires a code change if any URL changes.
Recommended follow-ups
Add description to ~25 pages with empty/echo descriptions
PR Type
Enhancement
Description
Replace flat sitemap with curated, AI-optimized llms.txt index
Organize documentation into 15 categorized sections by importance
Reduce entry count from 500+ to ~100 links with descriptions
Use dynamic version resolution and custom template getters
Diagram Walkthrough
File Walkthrough
config.ts
Wire up custom LLMs plugin configuration
docs/.vuepress/config.ts
- getLlmsPluginOptions function from llms config
- llmsPlugin() instead of empty config
llms.ts
Implement curated LLMs plugin configuration with section getters
docs/.vuepress/configs/llms.ts
- configuration, etc.)
- createPrefixSection, createSlugOrderSection, createFilterSection, createFilterSlugOrderSection
- object
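The factory helpers named in the walkthrough could be sketched roughly as follows. Every signature here is an assumption (hence the Sketch suffix): the actual helpers in docs/.vuepress/configs/llms.ts may take different arguments and emit a different format. The sketch only illustrates how a sortBy comparator keeps output deterministic and how a broken /.md URL is normalized.

```typescript
// Hypothetical minimal page shape for illustration.
interface PageRef {
  path: string;
  title: string;
}
type SectionGetter = (pages: PageRef[]) => string;
type Comparator = (a: PageRef, b: PageRef) => number;

// Locale-independent ordering by path, so output does not depend on page order.
const byPath: Comparator = (a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0);

// True for index/overview pages such as /quick-start/ or /quick-start/index.
function matchIndexSlugSketch(path: string, slug: string): boolean {
  return path.endsWith(`/${slug}/`) || path.endsWith(`/${slug}/index`);
}

// Rewrites a trailing "/.md" to "/index.md"; other URLs pass through unchanged.
function normalizeIndexUrlSketch(url: string): string {
  return url.endsWith("/.md") ? `${url.slice(0, -3)}index.md` : url;
}

function createFilterSectionSketch(
  heading: string,
  keep: (page: PageRef) => boolean,
  sortBy: Comparator = byPath,
): SectionGetter {
  return (pages) => {
    const lines = pages
      .filter(keep) // filter() copies, so the sort below does not mutate the input
      .sort(sortBy)
      .map((p) => `- [${p.title}](${normalizeIndexUrlSketch(p.path)})`);
    return [`## ${heading}`, ...lines].join("\n");
  };
}

// A prefix section is a filter section whose predicate is a path-prefix test.
function createPrefixSectionSketch(heading: string, prefix: string, sortBy: Comparator = byPath): SectionGetter {
  return createFilterSectionSketch(heading, (p) => p.path.startsWith(prefix), sortBy);
}
```

Building the prefix variant on top of the filter variant keeps sorting and URL normalization in one place, which is one plausible reason to organize the code around a small set of factories.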