Skip to content

feat(csv): add CSV validation to CI and pre-commit#389

Merged
Shillaker merged 5 commits intoBoavizta:mainfrom
ju3ouz4n:feat/issue-385-csv-check-precommit-and-action
Jan 5, 2026
Merged

feat(csv): add CSV validation to CI and pre-commit#389
Shillaker merged 5 commits intoBoavizta:mainfrom
ju3ouz4n:feat/issue-385-csv-check-precommit-and-action

Conversation

@ju3ouz4n
Copy link
Copy Markdown
Contributor

@ju3ouz4n ju3ouz4n commented Mar 31, 2025

Following discussions on #385 :
After discussing on this issue with @jonperron decided to give checkcsv a try, looks interesting and we could use same validating "format" file for CI and pre-commit

  • Adds 2 new dependencies "pre-commit" and "chkcsv" SEE 077405a - poetry.lock is a bit disappointing :(
  • Adds fmt format files following chkcsv format for each file that needs to be check SEE dea256d will certainly need to be refined
  • Adds pre-commit file conf, needs to install via pre-commit install prior testing SEE b4ab1a1
  • Adds gh-actions workflow file, using a new make target checkcsv SEE 511e099

I'm not and expert on poetry and dependency management with python, but it looks like there is a lots of modifications (downgrade??) on the poetry.lock, might be an issue.

Fixes #385

@jonperron
Copy link
Copy Markdown
Collaborator

Pre-commit has several dependencies as written in the lock file:

[package.dependencies]
cfgv = ">=2.0.0"
identify = ">=1.0.0"
nodeenv = ">=0.11.1"
pyyaml = ">=5.1"
virtualenv = ">=20.10.0"

That's why there are numerous differences (I have nearly the same in https://github.com/Boavizta/boaviztapi/pull/387/files#diff-f53a023eedfa3fbf2925ec7dc76eecdc954ea94b7e47065393dbad519613dc89) in the file.

@Shillaker Shillaker force-pushed the feat/issue-385-csv-check-precommit-and-action branch 2 times, most recently from c18da14 to 3b009fd Compare January 3, 2026 11:51
@Shillaker Shillaker changed the title Feat/issue 385 csv check precommit and action feat(csv): add CSV validation to CI and pre-commit Jan 3, 2026
@Shillaker Shillaker force-pushed the feat/issue-385-csv-check-precommit-and-action branch from 3b009fd to a7badd5 Compare January 3, 2026 12:04
@Shillaker Shillaker force-pushed the feat/issue-385-csv-check-precommit-and-action branch from a7badd5 to 37dabd4 Compare January 3, 2026 12:08
@Shillaker
Copy link
Copy Markdown
Collaborator

Shillaker commented Jan 3, 2026

In addition to rebasing to main, I made the following changes:

  • Symlink a single shared cloud.fmt file for all cloud VMs (aws, azure, gcp, scaleway)
  • Complete the CPU CSV spec, make CSV validation strict on line length, and fix up failing lines in cpu_spec.csv
  • Add Python script for running CSV validation so that we have a single entrypoint when running with make and pre-commit
  • Clean up pre-commit config file and include CSV check
  • Add new Poetry dependency group csv to allow running poetry install --only csv in the Github action

Copy link
Copy Markdown
Collaborator

@jonperron jonperron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a huge fan of symlink in repositories, but if you are confident that it would work on any dev environment (in particular Windows), then I'm fine with it.

I also added a nitpick comment.

poetry install --with dev

check-csv:
poetry run python3 validate_csv.py
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with poetry run, does it use the python3 from the venv ? If not and it relies on the python3 of the system, then I would rather set up a variable at the top of the file (such as PYTHON=venv/bin/python3 or whatever), so it does not complain because the chckcsv library is not installed.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, poetry run uses the Python from its own virtual environment, so we can guarantee that chckcsv is installed.

In my local checkout of this repo:

$ poetry run which python
/home/sshillaker/.cache/pypoetry/virtualenvs/boaviztapi-WPfJjdvf-py3.13/bin/python

$ which python
/home/sshillaker/.pyenv/shims/python

@jonperron
Copy link
Copy Markdown
Collaborator

You should also add the newly created .fmt files in dockerignore, so we don't take unecessary files in the image.

@Shillaker
Copy link
Copy Markdown
Collaborator

Shillaker commented Jan 5, 2026

Thanks for the comments @jonperron. I made the following changes:

  • Remove the symlinks and explicitly specify any non-default format file in the CSV check script
  • Add *.fmt to dockerignore

@Shillaker Shillaker merged commit fa60d48 into Boavizta:main Jan 5, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add csv checker to pre-commit

3 participants