feat(website): Auto-generate llms.txt and llms-full.txt #6247

Merged
Xuanwo merged 3 commits into apache:main from kingsword09:docs/llms-txt-generation on Jun 8, 2025

@kingsword09
Contributor

Which issue does this PR close?

Closes #5895.

Rationale for this change

What changes are included in this PR?

  • Added and configured my custom docusaurus-plugin-llms-builder plugin
  • llms.txt and llms-full.txt are now auto-generated by docusaurus build

Currently, the title, description, and details are taken from README.md, which might not be comprehensive enough.

Are there any user-facing changes?

@kingsword09 kingsword09 requested a review from Xuanwo as a code owner June 2, 2025 00:00
@dosubot dosubot bot added the size:M and releases-note/feat labels Jun 2, 2025
@Xuanwo
Member

Xuanwo commented Jun 2, 2025

Thank you so much for this! We have many externally generated documents, such as the Rust documentation created by rustdoc. Is this plugin able to generate those as well?

@kingsword09
Contributor Author

kingsword09 commented Jun 2, 2025

> Thank you so much for this! We have many externally generated documents, such as the Rust documentation created by rustdoc. Is this plugin able to generate those as well?

The llms.txt was generated based on the URLs in the sitemap.xml from the website.
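
To illustrate the sitemap-based approach described here, a minimal sketch follows. It is not the code of docusaurus-plugin-llms-builder; the helper names `extractSitemapUrls` and `renderLlmsTxt`, the "Docs" section heading, and the example.org sample URLs are all assumptions made for illustration. The output follows the common llms.txt layout of an H1 title, a blockquote summary, and a list of links.

```typescript
// Hypothetical helper: pull every <loc> URL out of a sitemap.xml string.
function extractSitemapUrls(sitemapXml: string): string[] {
  const urls: string[] = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(sitemapXml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Hypothetical helper: render a minimal llms.txt
// (H1 title, blockquote summary, one section with a link list).
function renderLlmsTxt(title: string, summary: string, urls: string[]): string {
  const links = urls.map((url) => `- [${url}](${url})`);
  return [`# ${title}`, "", `> ${summary}`, "", "## Docs", "", ...links, ""].join("\n");
}

// Made-up sample input; a real build would read the generated sitemap.xml.
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/docs/intro</loc></url>
  <url><loc>https://example.org/blog</loc></url>
</urlset>`;

const sampleLlmsTxt = renderLlmsTxt(
  "Example Project",
  "Summary text, for instance taken from README.md.",
  extractSitemapUrls(sampleSitemap),
);
```

Any page that is missing from sitemap.xml never reaches the output of a generator built this way, so its coverage is bounded by the sitemap.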

@Xuanwo
Member

Xuanwo commented Jun 2, 2025

> The llms.txt was generated based on the URLs in the sitemap.xml from the website.

Hi, that doesn't seem very useful since all the important content is located at links like https://opendal.apache.org/docs/rust/opendal/, which isn't included in the sitemap.xml.

@kingsword09
Contributor Author

kingsword09 commented Jun 2, 2025

> > The llms.txt was generated based on the URLs in the sitemap.xml from the website.
>
> Hi, that doesn't seem very useful since all the important content is located at links like https://opendal.apache.org/docs/rust/opendal/, which isn't included in the sitemap.xml.

Would it suffice to build the content of llms-full.txt from rustdoc JSON (https://docs.rs/about/rustdoc-json) and to add https://opendal.apache.org/docs/rust/opendal/ to llms.txt?

However, https://docs.rs/crate/opendal/latest/json doesn't seem to be fully covered yet. Once that link becomes active, the docs field in the JSON looks well suited for inclusion in llms-full.txt.
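
As a rough sketch of that idea: rustdoc JSON has a top-level `index` map whose items carry `name` and `docs` fields, so the documentation text can be collected in one pass. The interfaces below declare only the fields this sketch reads, and `collectDocs` plus the sample data are hypothetical. This is not how crates-llms-txt is implemented.

```typescript
// Only the fields this sketch reads; the real rustdoc JSON schema has many more.
interface RustdocItem {
  name: string | null;
  docs: string | null;
}

interface RustdocCrate {
  index: Record<string, RustdocItem>;
}

// Hypothetical helper: join the docs of every named, documented item
// into one Markdown string suitable for appending to llms-full.txt.
function collectDocs(crate: RustdocCrate): string {
  const sections: string[] = [];
  for (const id of Object.keys(crate.index)) {
    const item = crate.index[id];
    if (item.name !== null && item.docs !== null && item.docs.trim() !== "") {
      sections.push(`## ${item.name}\n\n${item.docs.trim()}`);
    }
  }
  return sections.join("\n\n");
}

// Made-up sample data standing in for a parsed rustdoc JSON file.
const sampleCrate: RustdocCrate = {
  index: {
    a: { name: "Widget", docs: "A sample documented item.\n" },
    b: { name: "undocumented_helper", docs: null },
    c: { name: null, docs: "An item without a name, such as an impl block." },
  },
};

const sampleFullText = collectDocs(sampleCrate);
```

Items without a name or without documentation are skipped here, which is a design choice of this sketch rather than something the thread specifies.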

image

@dosubot dosubot bot added the size:L label and removed the size:M label Jun 6, 2025
Member

@Xuanwo Xuanwo left a comment

crates-llms-txt is truly impressive! Let's go and see how it works.

@dosubot dosubot bot added the lgtm label Jun 6, 2025
@Xuanwo
Member

Xuanwo commented Jun 6, 2025

The website build failed:
image

@kingsword09
Contributor Author

> The website build failed: image

It works on my local MacBook, but I don't know why this build reports the error: Error: Cannot find module 'crates-llms-txt-napi-linux-x64-gnu'. The package is clearly published on npm: crates-llms-txt-napi-linux-x64-gnu 😂

@kingsword09 kingsword09 force-pushed the docs/llms-txt-generation branch from 828b909 to 12df2a6 Compare June 7, 2025 16:46
@kingsword09 kingsword09 force-pushed the docs/llms-txt-generation branch from 12df2a6 to f1e529b Compare June 8, 2025 00:52
@kingsword09
Contributor Author

> > The website build failed: image
>
> It works on my local MacBook, but I don't know why this build reports the error: Error: Cannot find module 'crates-llms-txt-napi-linux-x64-gnu'. The package is clearly published on npm: crates-llms-txt-napi-linux-x64-gnu 😂

The Node version requirement for the crates-llms-txt package was set too high previously.
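
For anyone debugging a similar failure, a small sketch of two checks that may help. The thread only states that the Node requirement was too high. The idea that a package manager may skip an optional native package whose declared Node requirement is not met is an assumption here, not something confirmed above. Both helper names are hypothetical, and the package-name shape is inferred from the error message.

```typescript
// Hypothetical helper: build a per-platform package name in the
// "<base>-<platform>-<arch>-<abi>" shape seen in the error message.
function nativePackageName(base: string, platform: string, arch: string, abi?: string): string {
  return [base, platform, arch, abi]
    .filter((part) => part !== undefined && part !== "")
    .join("-");
}

// Hypothetical helper: true when `current` >= `minimum`,
// both given as "major.minor.patch" with an optional leading "v".
function meetsMinimumVersion(current: string, minimum: string): boolean {
  const a = current.replace(/^v/, "").split(".").map(Number);
  const b = minimum.replace(/^v/, "").split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    if (x !== y) return x > y;
  }
  return true;
}

// Which native package would a Linux x64 (glibc) build try to load?
const expectedPackage = nativePackageName("crates-llms-txt-napi", "linux", "x64", "gnu");
```

Comparing the build machine's Node version against the requirement declared by the native package, for example with `meetsMinimumVersion`, is one way to spot a mismatch before the module lookup fails.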

@Xuanwo Xuanwo merged commit 5f89180 into apache:main Jun 8, 2025
247 checks passed
@Xuanwo
Member

Xuanwo commented Jun 8, 2025

Thank you!

@kingsword09 kingsword09 deleted the docs/llms-txt-generation branch June 8, 2025 05:59
Labels

lgtm (This PR has been approved by a maintainer)
releases-note/feat (The PR implements a new feature or has a title that begins with "feat")
size:L (This PR changes 100-499 lines, ignoring generated files.)

Development

Successfully merging this pull request may close these issues.

Add llms.txt to make opendal more LLM friendly

2 participants