Conversation
|
Thanks for this PR. I am curious about @EwoutH take on this. But as is clear from some of the changed, files, it does seem helpful in fixing basic typo's. |
EwoutH
left a comment
There was a problem hiding this comment.
Thanks for the PR. I see the value in such a solution. Let’s see if we can get the complexity a bit down.
A few questions in comments below.
.github/workflows/vale.yml
Outdated
|
|
||
| - name: Install Vale | ||
| run: | | ||
| wget https://github.com/errata-ai/vale/releases/download/v3.0.7/vale_3.0.7_Linux_64-bit.tar.gz |
There was a problem hiding this comment.
I don’t like hardcoded versions. Are there alternatives?
There was a problem hiding this comment.
Thanks for the feedback @EwoutH! I've addressed your comments:
Hardcoded version: Replaced manual installation with official errata-ai/[email protected]
Let me know if you'd like any other adjustments!
| - id: codespell | ||
|
|
||
| # Prose linting with Vale | ||
| - repo: https://github.com/errata-ai/vale |
There was a problem hiding this comment.
Do we both need to be in GitHub actions and Pre-commit? If not, I prefer only being in pre-commit.
There was a problem hiding this comment.
There is a technical catch with using only pre-commit.ci:
The pre-commit.ci bot runs in an isolated environment and cannot run vale sync to download the external Google styles. This causes Runtime Error unless the styles are present locally.
If we want to use only pre-commit, I will need to commit the entire Google style definition folder (inside .github/styles/Google) directly into the repository so the bot can access them.
Why I chose the Action: I saw that in PR #2568, there was hesitation about adding the Microsoft style files directly to the repo. To respect that preference and keep the repo clean, I used a GitHub Action to download the styles on the fly, rather than committing them.
Are you comfortable with committing the Google style files (~33 YAML files) to the repository? If so, I can switch to pre-commit only. Otherwise, the GitHub Actions workflow provides a cleaner solution.
Let me know your preference!
There was a problem hiding this comment.
So a GitHub Action can dynamically load them while pre-commit cannot?
There was a problem hiding this comment.
Yes, exactly.
- GitHub Actions allows us to run shell commands, so the workflow I added explicitly runs vale sync to download the styles before the check starts.
- pre-commit.ci is a stricter, sandboxed environment. It does not run arbitrary setup commands (like vale sync) before the hooks. It expects all configuration files to be present in the repository itself.
That is why the Action works 'out of the box' while pre-commit.ci fails unless we commit the style files.
There was a problem hiding this comment.
Right, so in our case, what does vale in pre-commit do?
There was a problem hiding this comment.
In our current setup, Vale in pre-commit acts as an immediate feedback tool. Provided the developer has run vale sync once on their machine, pre-commit catches style errors before they push code. This saves them from waiting for CI to fail. We keep the pre-commit hook solely for the benefit of local developers (so they can lint while they code). We rely on the GitHub Action to perform the actual "official" check on the server.
| @@ -1,6 +1,9 @@ | |||
| <!-- vale off --> | |||
There was a problem hiding this comment.
I don’t understand how vale is off here, but there are still changes in the document
There was a problem hiding this comment.
I first made vale run through the file to identify any spelling errors in the file. Then I turned vale off for that file to avoid noise from false positives, specifically regarding spell checking.
The Issue:
Vale's default spell checker treats technical terms (e.g., variable names, module paths, library-specific jargon) as spelling errors if they aren't in its standard dictionary. Since HISTORY.md is a log of code changes, it is naturally full of these terms.
What I tried first:
I initially attempted to fix this by building a project-specific vocabulary (accept.txt), adding valid technical terms to it so Vale would recognize them.
Why that didn't work:
I quickly found that the number of unique variable names and technical terms in the history log is massive. Maintaining an accept.txt list for every single new parameter or method mentioned in the changelog would become a significant maintenance burden (the file would grow indefinitely and require constant updates).
Since we already use codespell (which is better tuned for codebases), I decided it was cleaner to rely on codespell for catching typos in these specific files and use Vale purely for enforcing style/prose in the documentation, where it shines. If you need vale to be enabled in those 2 files, I can make changes to the file and enable it.
There was a problem hiding this comment.
Would it make sense to limit it to certain folders?
There was a problem hiding this comment.
Yes, we can certainly do that.
Currently, the configuration targets the docs/ folder AND *.md (files in the root directory).
If we limit Vale strictly to the docs/ folder:
- Pro: It automatically ignores
HISTORY.mdandCODE_OF_CONDUCT.mdwithout needing any exclude rules. - Con: It would stop linting the main
README.mdof the project (since it sits in the root), and that is often the most visible file for new users.
I stuck with the current approach (checking root files + manual excludes) specifically to keep the main README.md covered.
There was a problem hiding this comment.
Let’s start in this PR with /docs only, and then we can discuss if we want to extend it later.
There was a problem hiding this comment.
I have updated the PR to include only the changes in /docs.
There was a problem hiding this comment.
@EwoutH, can you confirm that this PR is now ready to be merged?
d3acfa9 to
5147810
Compare
…ter colons in technical docs
…into vale-final Merged in vale-final (pull request mesa#1)
EwoutH
left a comment
There was a problem hiding this comment.
While I still have some doubts about the proportionality (added value vs increased complexity), it's a relatively unintrusive, standalone implementation that could be easily reverted. So lets give it the benefit of the doubt.
|
my hunch is that some vale init needs to be run locally to download the relevant styles. |
|
Hi @EwoutH, thanks for testing this locally! The error To fix this, you just need to fetch the styles once. Please run: vale sync |
|
Might it be an idea to update |
|
Good idea, agreed. @vedantvakharia could you add a small snippet under https://github.com/mesa/mesa/blob/main/CONTRIBUTING.md#testing-and-code-standards? |
|
While I appreciate the effort, I would suggest revering this PR. pre-commit is already complex, especially for new contributors, and this makes it even more complex. I don't think that complexity weights up against the limited value vale adds. Maybe we can run it separately, in an isolated GitHub action, once a month or so. |
|
I don't understand your point. I haven't run into any issues with vale yet in precommit. |

Summary
Closes #1572
This PR implements Vale, a syntax-aware linter for prose, to ensure consistent documentation style across the project. It includes configurations for GitHub Actions (CI) and local pre-commit hooks.
Implementation Details
1. Configuration (
.vale.ini)codespell).suggestion(non-blocking) so they do not prevent commits.2. Custom Project Rules (
.github/styles/mesa.yml)error(blocking) level alerts in pre-commit, ensuring critical terminology (e.g., "Mesa" vs "mesa") is enforced strictly once enabled.3. Workflow Integration
.pre-commit-config.yamlfor local linting..github/workflows/vale.ymlto run checks on Pull Requests..gitignoreto prevent cluttering the repository with external dependency files.4. Exclusions
HISTORY.mdandCODE_OF_CONDUCT.mdfrom linting to preserve historical logs and standard community text.Questions for Reviewers
mesa.ymlis currently a placeholder. Are there specific terms (besides "Mesa" and "NumPy") you would like strictly enforced immediately?