Skip to content

Implementation of Vale#3022

Merged
quaquel merged 13 commits intomesa:mainfrom
vedantvakharia:vale-final
Jan 7, 2026
Merged

Implementation of Vale#3022
quaquel merged 13 commits intomesa:mainfrom
vedantvakharia:vale-final

Conversation

@vedantvakharia
Copy link
Copy Markdown
Contributor

Summary

Closes #1572
This PR implements Vale, a syntax-aware linter for prose, to ensure consistent documentation style across the project. It includes configurations for GitHub Actions (CI) and local pre-commit hooks.

Implementation Details

1. Configuration (.vale.ini)

  • Base Style: Implements the Google style guide.
    • Reasoning: Google's guide was chosen over Microsoft's as it focuses more on technical clarity and developer documentation, whereas Microsoft's rules can be overly corporate/verbose for this context.
  • Modifications:
    • Disabled Vale's internal spell checker to avoid redundancy (since Mesa already uses codespell).
    • Disabled specific Google rules that felt too strict for the current documentation state.
    • Severity: Google style alerts are set to suggestion (non-blocking) so they do not prevent commits.

2. Custom Project Rules (.github/styles/mesa.yml)

  • Added a dedicated style file for project-specific terminology.
  • Current Status: The rules are currently commented out/placeholders.
  • Behavior: These rules are configured to trigger error (blocking) level alerts in pre-commit, ensuring critical terminology (e.g., "Mesa" vs "mesa") is enforced strictly once enabled.

3. Workflow Integration

  • Pre-commit: Added Vale to .pre-commit-config.yaml for local linting.
  • CI: Added .github/workflows/vale.yml to run checks on Pull Requests.
  • Gitignore: Added Google styles to .gitignore to prevent cluttering the repository with external dependency files.

4. Exclusions

  • Excluded HISTORY.md and CODE_OF_CONDUCT.md from linting to preserve historical logs and standard community text.

Questions for Reviewers

  1. Custom Rules: mesa.yml is currently a placeholder. Are there specific terms (besides "Mesa" and "NumPy") you would like strictly enforced immediately?
  2. Workflow: I have enabled Vale in both CI and pre-commit. Please let me know if you prefer to restrict it to CI only.

@quaquel
Copy link
Copy Markdown
Member

quaquel commented Dec 29, 2025

Thanks for this PR. I am curious about @EwoutH take on this. But as is clear from some of the changed, files, it does seem helpful in fixing basic typo's.

@quaquel quaquel closed this Dec 29, 2025
@quaquel quaquel reopened this Dec 29, 2025
Copy link
Copy Markdown
Member

@EwoutH EwoutH left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I see the value in such a solution. Let’s see if we can get the complexity a bit down.

A few questions in comments below.


- name: Install Vale
run: |
wget https://github.com/errata-ai/vale/releases/download/v3.0.7/vale_3.0.7_Linux_64-bit.tar.gz
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t like hardcoded versions. Are there alternatives?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback @EwoutH! I've addressed your comments:
Hardcoded version: Replaced manual installation with official errata-ai/[email protected]

Let me know if you'd like any other adjustments!

- id: codespell

# Prose linting with Vale
- repo: https://github.com/errata-ai/vale
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we both need to be in GitHub actions and Pre-commit? If not, I prefer only being in pre-commit.

Copy link
Copy Markdown
Contributor Author

@vedantvakharia vedantvakharia Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a technical catch with using only pre-commit.ci:

The pre-commit.ci bot runs in an isolated environment and cannot run vale sync to download the external Google styles. This causes Runtime Error unless the styles are present locally.

If we want to use only pre-commit, I will need to commit the entire Google style definition folder (inside .github/styles/Google) directly into the repository so the bot can access them.

Why I chose the Action: I saw that in PR #2568, there was hesitation about adding the Microsoft style files directly to the repo. To respect that preference and keep the repo clean, I used a GitHub Action to download the styles on the fly, rather than committing them.

Are you comfortable with committing the Google style files (~33 YAML files) to the repository? If so, I can switch to pre-commit only. Otherwise, the GitHub Actions workflow provides a cleaner solution.
Let me know your preference!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So a GitHub Action can dynamically load them while pre-commit cannot?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly.

  • GitHub Actions allows us to run shell commands, so the workflow I added explicitly runs vale sync to download the styles before the check starts.
  • pre-commit.ci is a stricter, sandboxed environment. It does not run arbitrary setup commands (like vale sync) before the hooks. It expects all configuration files to be present in the repository itself.

That is why the Action works 'out of the box' while pre-commit.ci fails unless we commit the style files.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so in our case, what does vale in pre-commit do?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our current setup, Vale in pre-commit acts as an immediate feedback tool. Provided the developer has run vale sync once on their machine, pre-commit catches style errors before they push code. This saves them from waiting for CI to fail. We keep the pre-commit hook solely for the benefit of local developers (so they can lint while they code). We rely on the GitHub Action to perform the actual "official" check on the server.

@@ -1,6 +1,9 @@
<!-- vale off -->
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t understand how vale is off here, but there are still changes in the document

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I first made vale run through the file to identify any spelling errors in the file. Then I turned vale off for that file to avoid noise from false positives, specifically regarding spell checking.

The Issue:
Vale's default spell checker treats technical terms (e.g., variable names, module paths, library-specific jargon) as spelling errors if they aren't in its standard dictionary. Since HISTORY.md is a log of code changes, it is naturally full of these terms.

What I tried first:
I initially attempted to fix this by building a project-specific vocabulary (accept.txt), adding valid technical terms to it so Vale would recognize them.

Why that didn't work:
I quickly found that the number of unique variable names and technical terms in the history log is massive. Maintaining an accept.txt list for every single new parameter or method mentioned in the changelog would become a significant maintenance burden (the file would grow indefinitely and require constant updates).

Since we already use codespell (which is better tuned for codebases), I decided it was cleaner to rely on codespell for catching typos in these specific files and use Vale purely for enforcing style/prose in the documentation, where it shines. If you need vale to be enabled in those 2 files, I can make changes to the file and enable it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to limit it to certain folders?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can certainly do that.

Currently, the configuration targets the docs/ folder AND *.md (files in the root directory).

If we limit Vale strictly to the docs/ folder:

  • Pro: It automatically ignores HISTORY.md and CODE_OF_CONDUCT.md without needing any exclude rules.
  • Con: It would stop linting the main README.md of the project (since it sits in the root), and that is often the most visible file for new users.

I stuck with the current approach (checking root files + manual excludes) specifically to keep the main README.md covered.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s start in this PR with /docs only, and then we can discuss if we want to extend it later.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the PR to include only the changes in /docs.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EwoutH, can you confirm that this PR is now ready to be merged?

@quaquel quaquel added the docs Release notes label label Dec 30, 2025
Copy link
Copy Markdown
Member

@EwoutH EwoutH left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I still have some doubts about the proportionality (added value vs increased complexity), it's a relatively unintrusive, standalone implementation that could be easily reverted. So lets give it the benefit of the doubt.

@quaquel quaquel merged commit 4b3bd13 into mesa:main Jan 7, 2026
15 checks passed
@quaquel quaquel added the ci Release notes label label Jan 7, 2026
@EwoutH
Copy link
Copy Markdown
Member

EwoutH commented Jan 11, 2026

Running pre-commit locally, I get:

image

What do I need to do to resolve this?

@quaquel
Copy link
Copy Markdown
Member

quaquel commented Jan 11, 2026

my hunch is that some vale init needs to be run locally to download the relevant styles.

@vedantvakharia
Copy link
Copy Markdown
Contributor Author

Hi @EwoutH, thanks for testing this locally!

The error style 'Google' does not exist happens because I haven't committed the Google style definitions to the repository to keep our git history clean and avoid bloating the repo with upstream style files.

To fix this, you just need to fetch the styles once. Please run:

vale sync

@quaquel
Copy link
Copy Markdown
Member

quaquel commented Jan 12, 2026

Might it be an idea to update contributing.md to explicitly explain this?

@EwoutH
Copy link
Copy Markdown
Member

EwoutH commented Jan 12, 2026

Good idea, agreed.

@vedantvakharia could you add a small snippet under https://github.com/mesa/mesa/blob/main/CONTRIBUTING.md#testing-and-code-standards?

@EwoutH
Copy link
Copy Markdown
Member

EwoutH commented Feb 6, 2026

While I appreciate the effort, I would suggest revering this PR. pre-commit is already complex, especially for new contributors, and this makes it even more complex. I don't think that complexity weights up against the limited value vale adds.

Maybe we can run it separately, in an isolated GitHub action, once a month or so.

@quaquel
Copy link
Copy Markdown
Member

quaquel commented Feb 6, 2026

I don't understand your point. I haven't run into any issues with vale yet in precommit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci Release notes label docs Release notes label

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add prose linter to CI

3 participants