@jacobtomlinson
Member

@jacobtomlinson jacobtomlinson commented Dec 10, 2025

Adds documentation support for llms.txt via the sphinx-llm extension that I wrote.

In addition to building HTML pages, a markdown version of the documentation is also built for consumption by LLMs. This reduces the amount of context window space LLMs need to use by removing HTML/CSS/JavaScript markup in favour of more streamlined markdown.

  • For each documentation page a markdown version is built using the sphinx-markdown-builder (you can append .md to any documentation URL to view this)
  • An llms.txt file is generated, which acts as a sitemap that LLMs can use to discover pages
  • An llms-full.txt file is generated with the entire docs contained in a single markdown file to allow loading all of the Dask docs into context with a single HTTP request (which for Dask would use 1.1M input tokens 🙀)
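
As a rough illustration of how a client might consume these outputs, here is a minimal Python sketch. It assumes two things that are not confirmed beyond the description above: that the markdown variant of a page lives at the page URL with `.md` appended, and that llms.txt lists pages as ordinary markdown links (`[title](url)`). The helper names `markdown_url` and `parse_llms_txt`, and the `example.invalid` URLs, are hypothetical and are not part of sphinx-llm.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches markdown links of the form [title](url).
# Assumption: llms.txt lists pages as plain markdown links.
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def markdown_url(page_url: str) -> str:
    """Return the assumed markdown variant of a docs page URL.

    Appends ".md" to the URL path, dropping any query string or
    fragment so the suffix lands on the path itself.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path + ".md", "", ""))


def parse_llms_txt(text: str) -> list[tuple[str, str]]:
    """Extract (title, url) pairs from the markdown links in an llms.txt body."""
    return LINK_RE.findall(text)


# Made-up sample content, for illustration only.
sample = """# Example Docs

- [Getting started](https://example.invalid/start.html.md): first steps
- [API](https://example.invalid/api.html.md)
"""

pages = parse_llms_txt(sample)
first_page_md = markdown_url("https://example.invalid/docs/page.html#section")
```

A real client would fetch llms.txt over HTTP first and then request only the pages it needs, rather than loading llms-full.txt in one go.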

Build preview highlights

| Page | HTML | Markdown |
| --- | --- | --- |
| llms.txt | NA | Markdown |
| 10 minutes to dask | HTML | Markdown |
| API docs example (dask.array.angle) | HTML | Markdown |

@github-actions
Contributor

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

      9 files  ±0        9 suites  ±0   3h 15m 40s ⏱️ + 2m 56s
 18 159 tests ±0   16 944 ✅ ±0   1 215 💤 ±0  0 ❌ ±0 
162 568 runs  ±0  150 560 ✅ ±0  12 008 💤 ±0  0 ❌ ±0 

Results for commit a6e4365. ± Comparison against base commit 068be28.

@dcherian
Collaborator

Nice, we have merged support for it in Xarray. Thanks for writing the plugin!

@jacobtomlinson
Member Author

jacobtomlinson commented Dec 10, 2025

@dcherian actually it looks like you're using the other plugin. There's a comparison table in the one I wrote. The most notable thing is that the other one doesn't actually build the docs, it just passes the source straight through. My plugin runs a full parallel sphinx build so that all extensions and directives are processed. This is super important for things like intersphinx, docrefs and autoapi.

Member

@jrbourbeau jrbourbeau left a comment

Thanks @jacobtomlinson!

@jrbourbeau jrbourbeau merged commit 2497ebe into dask:main Dec 10, 2025
23 of 25 checks passed
@jacobtomlinson jacobtomlinson deleted the sphinx-llm branch December 11, 2025 09:17