Skip to content

fix(ssl): implement sslmode=require without cert verification#711

Merged
NikolayS merged 1 commit intomainfrom
fix/ssl-require-errors
Mar 24, 2026
Merged

fix(ssl): implement sslmode=require without cert verification#711
NikolayS merged 1 commit intomainfrom
fix/ssl-require-errors

Conversation

@NikolayS
Copy link
Copy Markdown
Owner

Summary

sslmode=require means encrypt-only: TLS is mandatory but the server
certificate is not verified (that is verify-ca/verify-full's job).
Previously, the Require branch called connect_tls_default() which
loads Mozilla webpki roots and verifies the certificate chain, causing
connection failures against self-signed certs that psql handles fine.

Changes

  • NoVerifier: new ServerCertVerifier that accepts any server
    certificate without chain or hostname validation
  • make_tls_config_require(): builds a ClientConfig using
    NoVerifier, with optional mutual TLS (client cert) support
  • connect_one(): SslMode::Require now calls
    connect_tls_with_config() with the new config instead of
    connect_tls_default()
  • map_connect_error(): walk the tokio-postgres error source chain
    to extract the real message; introduce two new error variants:
    • SslNotSupported: server returned 'N' to SSLRequest — message
      SSL error: server does not support SSL (matches psql wording)
    • SslCertVerificationFailed: rustls cert chain/hostname error —
      message SSL error: certificate verification failed: ...
    • TlsError prefix changed from TLS error: to SSL error:
      (matches psql convention)
  • classify_connect_error(): extracted from map_connect_error()
    so the classification rules are unit-testable without a live server
  • 7 unit tests covering: SslNotSupported, SslCertVerificationFailed
    (UnknownIssuer and NotValidForName), SslRequired (server-side),
    generic TlsError, auth error, plain connection failure

Testing

Verified against a TLS-enabled Postgres (self-signed cert, no CA) at
localhost:15436 — sslmode=require now connects and returns ssl=t.
All 7 new unit tests pass. Full clippy (no warnings) and rustfmt clean.

Closes #710

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 24, 2026

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 55.94406% with 63 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.79%. Comparing base (389b876) to head (90785c4).

Files with missing lines Patch % Lines
src/connection.rs 55.94% 63 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #711      +/-   ##
==========================================
- Coverage   68.84%   68.79%   -0.05%     
==========================================
  Files          46       46              
  Lines       30870    30992     +122     
==========================================
+ Hits        21251    21318      +67     
- Misses       9619     9674      +55     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

NikolayS added a commit that referenced this pull request Mar 24, 2026
Add Section D to tests/compat/test-connections.sh covering sslmodes:
- D1: sslmode=disable (verifies ssl=f in pg_stat_ssl)
- D2: sslmode=prefer (verifies successful connection)
- D3: sslmode=require vs TLS server (verifies ssl=t; regression test
  for the bug fixed in PR #711 / issue #710)
- D4: sslmode=require vs plain server (verifies non-zero exit with
  SSL/TLS error message)
- D5: sslmode=verify-ca (SKIP — known bug, see issue #712)
- D6: sslmode=verify-full (SKIP — known bug, see issue #712)

Section D requires TEST_PG_TLS_HOST/TEST_PG_TLS_PORT env vars; if
unset the entire section is skipped with a clear message.

Also update .github/workflows/checks.yml (connection-tests job):
- Add a 'Start TLS postgres' step that generates a self-signed cert
  and launches a postgres:17 container with ssl=on on port 5433
- Pass TEST_PG_TLS_HOST/TEST_PG_TLS_PORT/TEST_PG_TLS_PASSWORD env
  vars to the 'Run connection tests' step
@NikolayS
Copy link
Copy Markdown
Owner Author

REV Review — PR #711: fix(ssl): sslmode=require without cert verification

Summary

Correctly implements sslmode=require semantics (encrypt-only, no cert verification) by introducing a NoVerifier custom ServerCertVerifier, replacing the previous behavior that used the system CA store and broke against self-signed/private-CA server certs. Also refactors error classification into a testable classify_connect_error function and adds fine-grained SSL error variants matching psql wording.

Blocking findings

None.

Non-blocking findings

src/connection.rsNoVerifier::new() hardcodes ring::default_provider()

NoVerifier::new() (line ~1584) instantiates rustls::crypto::ring::default_provider() for its provider field, while main.rs installs aws_lc_rs as the global process-level provider. This is the same pattern used by the pre-existing NoCnVerifier (line ~1655), so this is not a regression — but both structs carry a latent inconsistency: if ring is ever dropped as a Cargo dependency (it is currently in Cargo.lock only because rustls pulls it in for optional features), the crypto::ring module would disappear. Consider using rustls::crypto::CryptoProvider::get_default().cloned().unwrap_or_else(|| Arc::new(rustls::crypto::aws_lc_rs::default_provider())) in both verifiers, or extracting provider construction into a shared helper. Low urgency because both providers are currently compiled in.

src/connection.rs — no test for the partial cert/key WARNING path in make_tls_config_require

The branch (Some(_), None) | (None, Some(_)) emits a WARNING to stderr and falls back to with_no_client_auth(). This behavior is documented in the docstring and matches the pre-existing make_tls_config_verify_ca/full functions (which also lack this test), but for a security-sensitive path it would be worth a unit test that confirms the fallback (e.g., constructing a ConnParams with only ssl_cert set and asserting the returned config has no client auth). Not blocking because the pattern is identical to the verified CA functions.

src/connection.rsclassify_connect_error string matching is fragile for future rustls versions

classify_connect_error matches on the literal substrings "invalid peer certificate" and "certificate verify failed" (line ~2258–2261). These are rustls Display strings and could change across minor rustls bumps. A comment citing the specific rustls error type (rustls::Error::InvalidCertificate) and the rustls version where the string was verified would make future maintenance easier. Again, the surrounding comment for "server does not support TLS" already does this correctly — same approach should be applied here.

CLAUDE.md — security rules addition is appropriate but placement could be more discoverable

The new "Security rules" section is a useful addition. Consider whether it belongs in a dedicated SECURITY.md or docs/contributing.md rather than CLAUDE.md, so it's visible to contributors who may not be using Claude Code. This is cosmetic.

Potential issues

map_connect_error error chain walk may pick up an intermediate, less-specific message

The new map_connect_error walks std::error::Error::source() and replaces best with any non-empty, non-duplicate string it finds, always preferring the last such source message. In practice, tokio-postgres wraps errors as Kind::Tls → rustls error → (nothing deeper), so the innermost message is what we want. However, if a future tokio-postgres version adds another wrapping layer, the walk could surface an internal implementation detail. This is an acceptable trade-off and matches the documented intent; just worth flagging for awareness.

Test coverage

Strong. The PR adds 7 targeted unit tests for classify_connect_error covering:

  • SslNotSupported from the tokio-postgres "server does not support TLS" string
  • SslCertVerificationFailed from both "invalid peer certificate: UnknownIssuer" and "invalid peer certificate: NotValidForName"
  • SslRequired (server-side enforcement)
  • Generic TlsError fallthrough
  • AuthenticationFailed guard
  • ConnectionFailed guard

Tests are well-structured, use assert!(matches!(...)) correctly with {err:?} in failure messages, and verify both the variant and the Display string where relevant.

Gaps (non-blocking, noted above):

  • No test for the partial ssl_cert/ssl_key warning path in make_tls_config_require
  • No integration test for the happy path (sslmode=require against a real server with a self-signed cert), but this is impractical in unit tests and likely covered by manual/CI integration tests

Verdict

APPROVE

sslmode=require means encrypt-only: the connection must use TLS but
the server certificate is not verified. Previously, require fell back
to connect_tls_default() which loads Mozilla webpki roots and
verifies the chain, causing failures against self-signed certs.

Changes:
- Add NoVerifier: a ServerCertVerifier that accepts any cert without
  chain or hostname validation (sslmode=require semantics)
- Add make_tls_config_require(): builds a ClientConfig using NoVerifier,
  with optional client cert (mutual TLS) support
- Update SslMode::Require branch in connect_one() to use
  connect_tls_with_config() + make_tls_config_require() instead of
  connect_tls_default()
- Improve map_connect_error(): walk the tokio-postgres error source
  chain to extract the real message; classify into specific variants:
  SslNotSupported (server has no TLS), SslCertVerificationFailed
  (verify-ca/verify-full cert errors), TlsError (other TLS failures)
- Add SslNotSupported and SslCertVerificationFailed error variants with
  psql-matching messages ("SSL error: server does not support SSL",
  "SSL error: certificate verification failed: ...")
- Extract classify_connect_error() for testability
- Add 7 unit tests covering all SSL error classification paths

Closes #710
@NikolayS NikolayS force-pushed the fix/ssl-require-errors branch from 70298a7 to 90785c4 Compare March 24, 2026 01:13
@NikolayS NikolayS merged commit 8885726 into main Mar 24, 2026
16 checks passed
@NikolayS NikolayS deleted the fix/ssl-require-errors branch March 24, 2026 01:28
NikolayS added a commit that referenced this pull request Mar 24, 2026
Add Section D to tests/compat/test-connections.sh covering sslmodes:
- D1: sslmode=disable (verifies ssl=f in pg_stat_ssl)
- D2: sslmode=prefer (verifies successful connection)
- D3: sslmode=require vs TLS server (verifies ssl=t; regression test
  for the bug fixed in PR #711 / issue #710)
- D4: sslmode=require vs plain server (verifies non-zero exit with
  SSL/TLS error message)
- D5: sslmode=verify-ca (SKIP — known bug, see issue #712)
- D6: sslmode=verify-full (SKIP — known bug, see issue #712)

Section D requires TEST_PG_TLS_HOST/TEST_PG_TLS_PORT env vars; if
unset the entire section is skipped with a clear message.

Also update .github/workflows/checks.yml (connection-tests job):
- Add a 'Start TLS postgres' step that generates a self-signed cert
  and launches a postgres:17 container with ssl=on on port 5433
- Pass TEST_PG_TLS_HOST/TEST_PG_TLS_PORT/TEST_PG_TLS_PASSWORD env
  vars to the 'Run connection tests' step
NikolayS added a commit that referenced this pull request Mar 24, 2026
* test(ci): add SSL mode connection tests (Section D)

Add Section D to tests/compat/test-connections.sh covering sslmodes:
- D1: sslmode=disable (verifies ssl=f in pg_stat_ssl)
- D2: sslmode=prefer (verifies successful connection)
- D3: sslmode=require vs TLS server (verifies ssl=t; regression test
  for the bug fixed in PR #711 / issue #710)
- D4: sslmode=require vs plain server (verifies non-zero exit with
  SSL/TLS error message)
- D5: sslmode=verify-ca (SKIP — known bug, see issue #712)
- D6: sslmode=verify-full (SKIP — known bug, see issue #712)

Section D requires TEST_PG_TLS_HOST/TEST_PG_TLS_PORT env vars; if
unset the entire section is skipped with a clear message.

Also update .github/workflows/checks.yml (connection-tests job):
- Add a 'Start TLS postgres' step that generates a self-signed cert
  and launches a postgres:17 container with ssl=on on port 5433
- Pass TEST_PG_TLS_HOST/TEST_PG_TLS_PORT/TEST_PG_TLS_PASSWORD env
  vars to the 'Run connection tests' step

* fix(tests): address REV blocking findings in D4 and CI readiness loop

- D4 (test_ssl_require_fail): add guard to skip when TEST_PGHOST is unset,
  preventing set -u abort when script is run without plain-server env vars.
  Use ${VAR:-} defaults throughout to be safe with set -u.
- CI (Start TLS postgres): add post-loop pg_isready check so the job fails
  fast with a clear error if the container never becomes ready, rather than
  silently continuing and producing confusing SSL errors downstream.
- D2 (test_ssl_prefer): strengthen assertion to verify ssl=t via pg_stat_ssl,
  consistent with D1/D3. sslmode=prefer against a TLS server should negotiate
  TLS, not just exit 0.
- Trailing whitespace: clean up results echo line.

* fix(tests): D2 sslmode=prefer asserts exit 0 only, not ssl=t

With a self-signed cert prefer falls back to plaintext — correct behavior.
D3 (sslmode=require) is the authoritative TLS negotiation test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(ssl): fully support sslmode=require with proper error handling

2 participants