Add --validate mode to check input syntax without running a query #88

Description

@vmvarela

Users working with untrusted or generated data often want to verify that a file is well-formed before running queries. Today this requires piping the file through a no-op query (SELECT 1) and interpreting the exit code. A dedicated --validate flag makes the intent explicit and gives clearer output: a row and column summary on success, the parse error on failure.

Supported input formats: CSV, TSV, JSON, NDJSON (via -I / --input-format).

Example

# Valid CSV
$ cat good.csv | sql-pipe --validate
OK: 1,234 rows, 5 columns (id INTEGER, name TEXT, amount REAL, region TEXT, date TEXT)
$ echo $?
0

# Invalid CSV
$ cat bad.csv | sql-pipe --validate
error: row 42: unterminated quoted field
$ echo $?
2

# Valid JSON array
$ cat data.json | sql-pipe --validate -I json
OK: 3 rows, 2 columns (id TEXT, name TEXT)

# Valid NDJSON
$ cat data.ndjson | sql-pipe --validate -I ndjson
OK: 3 rows, 2 columns (id TEXT, name TEXT)
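
A script that gates a pipeline on this check could wrap the proposed flag roughly as follows. This is a minimal Python sketch, not part of sql-pipe: `validate_input` is a hypothetical helper, and reading the error from stderr (falling back to stdout) is an assumption, since the issue does not say which stream the error message uses.

```python
import subprocess


def validate_input(data, cmd=("sql-pipe", "--validate")):
    """Feed `data` (bytes) to the validator and return (ok, message).

    Exit code 0 means the input parsed, 2 means a parse error, as proposed
    in this issue. Any other exit code is treated as unexpected.
    """
    proc = subprocess.run(list(cmd), input=data, capture_output=True)
    if proc.returncode == 0:
        return True, proc.stdout.decode().strip()
    if proc.returncode == 2:
        # Assumption: the error goes to stderr; fall back to stdout if empty.
        message = proc.stderr or proc.stdout
        return False, message.decode().strip()
    raise RuntimeError(f"unexpected exit code {proc.returncode}")
```

Passing the command as a parameter keeps the helper usable with extra flags such as `-I json`, for example `validate_input(data, ("sql-pipe", "--validate", "-I", "json"))`.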

Acceptance Criteria

  • --validate parses the entire input and prints a summary to stdout on success: OK: <n> rows, <m> columns (<col> <TYPE>, ...)
  • On parse error, print the existing error message (row number + description) and exit 2
  • --validate does not run any SQL query — no query argument required or used
  • Works with --delimiter, --tsv, --no-type-inference
  • Works with -I json and -I ndjson (columns reported as TEXT)
  • Exit 0 on success, exit 2 on parse error
  • Documented in --help, README.md, and docs/sql-pipe.1.scd
  • Tests: valid CSV/JSON/NDJSON exit 0 with correct summary; malformed input exits 2
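
The output contract in the criteria above can be pinned down with a small formatter. The following Python sketch is illustrative only and is not the project's implementation: `format_summary` and `format_error` are hypothetical names, and the thousands separator in the row count is an assumption taken from the "1,234 rows" example.

```python
EXIT_OK = 0
EXIT_PARSE_ERROR = 2


def format_summary(row_count, columns):
    """Build the success line: OK: <n> rows, <m> columns (<col> <TYPE>, ...).

    `columns` is a list of (name, sql_type) pairs in input order.
    """
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    # Assumption: the row count uses a thousands separator, as in "1,234 rows".
    return f"OK: {row_count:,} rows, {len(columns)} columns ({cols})"


def format_error(row, description):
    """Build the failure line: row number plus description."""
    return f"error: row {row}: {description}"
```

A test suite for the last criterion could assert on these exact strings together with `EXIT_OK` and `EXIT_PARSE_ERROR`.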

Notes

  • CSV/TSV: reuses the existing parser and type inference pipeline; skips sqlite3_exec entirely
  • JSON: parses the full array, counts items, extracts column names from the first object (all types TEXT)
  • NDJSON: streams line by line, counts objects, extracts column names from the first object (all types TEXT)
  • The column type summary for CSV/TSV requires running type inference (first 100 rows); the rest of the input is still parsed to count rows and detect errors
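
The JSON and NDJSON behavior described in these notes can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the project's code: the function names are hypothetical, blank NDJSON lines are skipped by assumption, and the error wording comes from Python's `json` module rather than from sql-pipe.

```python
import json


def _columns_from_first(obj):
    """Take column names from the first object; every type is TEXT."""
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    return [(name, "TEXT") for name in obj]


def validate_json(text):
    """Parse the full array, count items, return (row_count, columns)."""
    data = json.loads(text)  # raises json.JSONDecodeError (a ValueError)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    columns = _columns_from_first(data[0]) if data else []
    return len(data), columns


def validate_ndjson(lines):
    """Stream line by line, count objects, return (row_count, columns)."""
    count = 0
    columns = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"row {lineno}: {exc.msg}") from exc
        if count == 0:
            columns = _columns_from_first(obj)
        count += 1
    return count, columns
```

Because `validate_ndjson` accepts any iterable of lines, it can consume an open file or `sys.stdin` directly without loading the whole input into memory, which matches the streaming approach in the notes.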
