Resurrect SQLLogic test in CI#97581
Conversation
The SQLLogic test (`tests/sqllogic/`) runs the sqllogictest corpus against ClickHouse, comparing results with SQLite as a reference. It was disabled in Aug 2024 (PR #68654) due to an unhandled `sqlglot` parse error, and the CI infrastructure was deleted in Mar 2025. This change: - Fixes `sqlglot` crashes in `test_parser.py` by wrapping both the CREATE TABLE transpilation and the CAST/NULL transformation paths in `try/except`, falling back to original SQL on failure - Replaces the ODBC dependency with the native `clickhouse_driver.dbapi` driver (already available in the Docker image), adding a `NATIVE` engine to `connection.py` with `NativeConnectingArgs` - Updates `runner.py` to use the `NATIVE` engine instead of `ODBC` - Adds `sqlglot` and `deepdiff` to `ci/docker/stateless-test/requirements.txt` - Creates `ci/jobs/sqllogic_test.py` CI job script that clones the sqllogictest repo, then runs self-test, statements-test, and complete-test - Wires the new `SQLLogic test` job into the Praktika CI master workflow - Updates `canonic_report.json` to match new driver and `sqlglot` version behavior Changelog category: CI Fix or improvement Changelog entry: Resurrect SQLLogic test in CI, replacing ODBC with native driver Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Local test resultsRan the full SQLLogic test suite locally against a debug build. Self-testStatements test222 failures breakdown:
Complete test480,145 failures breakdown:
These results are consistent with the historical results before the test was disabled. The failures are expected ClickHouse-vs-SQLite behavioral differences, not test runner bugs. |
Failure classification (complete test: 480,145 / 3,880,035 failures)1.
|
- `cast_keep_nullable` fixes 260k failures (54.3%) where ClickHouse threw `Cannot convert NULL to a non-nullable type` in WHERE clauses like `WHERE NULL <= CAST(NULL AS REAL)`. SQLite evaluates these as false and returns empty results. - `prefer_column_name_to_alias` fixes ~60k failures (12.4%) where ClickHouse resolved `col0` in WHERE to the aggregate alias `COUNT(*) AS col0` instead of the table column `tab.col0`, matching SQLite's behavior. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Add `enable_block_number_column` and `enable_block_offset_column` SETTINGS to CREATE TABLE transpilation so that lightweight mutations are supported. Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Add `allow_nullable_key = 1` to CREATE TABLE SETTINGS in `test_parser.py` to fix nullable primary key columns rejection (Code 44) - Cap the multiprocessing pool at 16 workers in `runner.py` to prevent overwhelming the ClickHouse server on machines with many cores - Update canonic self-test report to reflect the improved results Results: - Statements test: 100% pass rate (all 213 tests passed) - Complete test: 99.27% pass rate (15,130 / 2,058,307 failures) Co-Authored-By: Claude Opus 4.6 <[email protected]>
SQLLogic Test Results (Updated with new settings)Settings applied:
Results:
Failure Classification (15,130 total failures):
Key examples:1.
2.
3. Aggregate over empty set (3.2%):
|
Standard SQL returns NULL for aggregates over empty sets (e.g., `AVG(-78) ... WHERE <always-false>`), while ClickHouse returns NaN/0. Enable `aggregate_functions_null_for_empty = 1` to match the expected behavior. Co-Authored-By: Claude Opus 4.6 <[email protected]>
SQLLogic Test Results (latest run)Settings enabled: Table settings: Statements test: 100% pass rate (213/213)Complete test: 99.34% pass rate (13,665 / 2,058,307 failures)
All categories are known ClickHouse-vs-SQLite behavioral differences, not bugs. |
When the parser strips parentheses from `IN (col)` (single element), it produces `IN col` with a bare `ASTIdentifier`. After resolution this becomes a `ColumnNode`, which is not handled by `CollectSets`. Wrap it in `tuple()` so the existing tuple → `has` rewrite path handles it. This fixes queries like `SELECT * FROM t WHERE x IN (col)`. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Updated SQLLogic Test Results (after
|
| # | Category | Count | % | Description |
|---|---|---|---|---|
| 1 | NOT operator precedence | 10,510 | 94.7% | NOT (col0) IS NULL parsed as (NOT col0) IS NULL instead of NOT (col0 IS NULL). Parens around the operand cause the Pratt parser to greedily consume NOT (expr) as a prefix expression. Without parens (NOT col0 IS NULL) it works correctly. |
| 2 | Duplicate column aliases | 510 | 4.6% | SELECT col0, expr AS col0 rejected by ClickHouse, allowed by SQLite. |
| 3 | Integer division | 40 | 0.4% | -7/2 = -4 (floor) in ClickHouse vs -3 (truncate toward zero) in SQLite. |
| 4 | IN with column refs (remaining) | 30 | 0.3% | Edge cases with complex expressions in IN lists. |
| 5 | Parenthesized FROM | 5 | <0.1% | FROM (tab0 CROSS JOIN tab1) not supported. |
NOT precedence details
Verified against PostgreSQL, MySQL, DuckDB, SQLite — all agree with the SQL standard: NOT has lower precedence than IS NULL, IN, BETWEEN, LIKE. ClickHouse's Pratt parser also assigns NOT lower precedence (priority 5) than IS NULL (6), and NOT col0 IS NULL parses correctly. But NOT (col0) IS NULL breaks: the parser consumes (col0) as a complete subexpression, applies NOT to it, then IS NOT NULL binds to the result.
Settings used
Server: default_table_engine=MergeTree, union_default_mode=DISTINCT, group_by_use_nulls=1, join_use_nulls=1, allow_create_index_without_type=1, create_index_ignore_unique=1, cast_keep_nullable=1, prefer_column_name_to_alias=1, aggregate_functions_null_for_empty=1
Table: allow_nullable_key=1, enable_block_number_column=1, enable_block_offset_column=1
Standard SQL allows parentheses around joined table expressions: `SELECT * FROM (t1 CROSS JOIN t2)`. ClickHouse previously rejected this with a syntax error because `ParserSubquery` only accepted SELECT/EXPLAIN inside parentheses. The fix adds a fallback in `ParserTableExpression`: when all normal parse attempts fail and the position is at `(`, try to parse the content as `ParserTablesInSelectQuery`. If it contains a join (more than one table element), convert it to `SELECT * FROM <tables>` and wrap as a subquery. Single-table parenthesization `(t1)` is not matched to avoid ambiguity with expression parentheses. Co-Authored-By: Claude Opus 4.6 <[email protected]>
The Pratt parser had a special case: when `NOT` (as a unary operator) was followed by `(`, it was treated as a function call `not(...)` instead of a unary prefix operator. This caused `NOT (col) IS NULL` to be parsed as `isNull(not(col))` instead of the SQL-standard `not(isNull(col))`. The fix removes this special case, always treating `NOT` as a unary prefix operator via `pushOperator`. The explicit function-call syntax `not(expr)` still works through the separate function parsing path at line 2364. This aligns ClickHouse with PostgreSQL, MySQL, DuckDB, and SQLite, all of which give `IS NULL`, `IS NOT NULL`, `IN`, `BETWEEN`, and `LIKE` higher precedence than `NOT`. Co-Authored-By: Claude Opus 4.6 <[email protected]>
When `NOT` is followed by `(`, the parser treats it as a function call `not(...)`. With multiple comma-separated values like `NOT (1, 2, 3)`, this produced `not(1, 2, 3)` — a function with 3 arguments, which is semantically wrong since `NOT` is a unary operator. Now, when the `not` function layer closes with more than 1 argument, the arguments are wrapped in a `tuple()`: `not(tuple(1, 2, 3))`. This changes the error from `NUMBER_OF_ARGUMENTS_DOESNT_MATCH` to the more appropriate `ILLEGAL_TYPE_OF_ARGUMENT`, and ensures consistent AST formatting round-trips. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Classify all 201,707 query failures from the full sqllogictest corpus (5.9M tests total, 96.6% pass rate) by root cause: - Integer division semantics (42%): ClickHouse `/` returns Float64 - Duplicate alias rejection (21%): `SELECT x AS col, y AS col` - `NOT` operator precedence (17%): `NOT (x) IS NULL` parsed as `isNull(not(x))` - Type system strictness (8%): no Int/Float supertype, NULL to non-Nullable - Subquery/IN behavior (8%): different NULL handling - SQL compatibility gaps (1%): parenthesized FROM, GROUP BY strictness Co-Authored-By: Claude Opus 4.6 <[email protected]>
Move the classification to a PR comment instead. Co-Authored-By: Claude Opus 4.6 <[email protected]>
SQLLogic Test: Full Test Suite Results (5.9M tests)
Statement Failures (8)All 8 are unsupported SQLite-specific features:
Query Failures by Root Cause (201,699)
Detailed Exception Breakdown (60,186 queries that threw exceptions)
Settings used: |
Remove the special case that treated `NOT (` as a function call. Instead, always push `NOT` as a unary prefix operator via `pushOperator`. This gives `NOT` correct SQL-standard precedence (lower than `IS NULL`, `BETWEEN`, and `LIKE`), so that: - `NOT (x) IS NULL` parses as `NOT (x IS NULL)` instead of `isNull(not(x))` - `NOT (x) BETWEEN a AND b` parses as `NOT (x BETWEEN a AND b)` - `NOT (x) LIKE pattern` parses as `NOT (x LIKE pattern)` Also remove the tuple-wrapping hack for multi-argument `NOT` since it is no longer needed: `NOT (1, 1, 1)` is now parsed as `NOT tuple(1, 1, 1)` via the Pratt parser and `RoundBracketsLayer`. This changes `NOT (0) + NOT (0)` from `2` to `0`, which is the correct SQL-standard behavior since `NOT` has lower precedence than `+`. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Resolve conflicts: - ci/workflows/master.py: keep `sqllogic_test_master_job` from PR, use `llvm_coverage_job` from master - .github/workflows/master.yml and pull_request.yml: regenerated via `praktika yaml` - tests/queries/0_stateless/03168_inconsistent_ast_formatting.reference: keep PR's `NOT` parenthesization Co-Authored-By: Claude Opus 4.6 <[email protected]>
The merged parser produces master's formatting behavior for NOT with tuple arguments (no extra parentheses). Co-Authored-By: Claude Opus 4.6 <[email protected]>
|
Remove `sqlglot` dependency from `test_parser.py`: ClickHouse natively handles SQLite types (`INTEGER`, `TEXT`, `VARCHAR`), and settings `default_table_engine`, `allow_create_index_without_type`, `create_index_ignore_unique`, and `cast_keep_nullable` handle the rest. Simplify `convert_request` to only append SETTINGS for nullable keys and block columns. Remove the `min(16, ...)` cap on parallel processes in `runner.py`, letting the test use all available CPUs. Complete test suite results: 5,765,647/5,939,581 passed (97.1%). Co-Authored-By: Claude Opus 4.6 <[email protected]>
|
|
Generate a color-coded HTML report showing per-stage test breakdown (statements/queries, top failing test files) and enforce success thresholds on the `complete-clickhouse` stage: minimum 5,939,581 total tests, maximum 201,707 failed tests. The report is attached as an artifact in the Praktika CI dashboard. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Each failing test file entry now shows a breakdown like `(200 stmt, 300 query)` so it is clear whether failures are in statements, queries, or both. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Replace the per-file failure listing with a failure classification table that groups errors by their root cause: hash mismatch, row count mismatch, value mismatch, column count mismatch, ClickHouse error codes (e.g. `Code 179`, `Code 349`), and others. Each category shows the count and percentage of total failures. Co-Authored-By: Claude Opus 4.6 <[email protected]>
…ailure category Set `MAX_FAILED_TESTS` to 173,986 and add up to 5 example queries for each failure category in the SQLLogic HTML report. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Closes #15112.
Summary
tests/sqllogic/) that was disabled in Aug 2024 (PR CI: Disable SQLLogic job #68654) due to an unhandledsqlglotparse error, and whose CI infrastructure was deleted in Mar 2025sqlglotcrashes intest_parser.pyby wrapping CREATE TABLE transpilation and CAST/NULL transformation intry/except, falling back to original SQL on failureclickhouse_driver.dbapidriver (already in the Docker image), adding aNATIVEengine toconnection.pyci/jobs/sqllogic_test.pyCI job and wire it into the Praktika master workflowINoperator (IN (col)→IN tuple(col))FROMclauseNOT (1, 1, 1)by wrapping in tupleLocal test results
Self-test: Passed
Statements test (210,711 tests): 210,703 passed, 8 failed (99.996%)
DROP INDEXwithoutON,REINDEX,CREATE TRIGGER,UPDATEwithoutWHERE,UNIQUEconstraint,INSERT OR REPLACE,count(DISTINCT *))Complete test (5,939,581 tests): 5,737,874 passed, 201,707 failed (96.6%)
35/50→0in SQLite vs0.7in ClickHouse)EXISTSsubquery behavior differencesrandom/exprtests due to arithmetic differencesTest plan
SQLLogic testjob successfullyChangelog category:
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Resurrect SQLLogic test in CI, replacing ODBC with native driver
🤖 Generated with Claude Code