Skip to content

feat(persistence): Add JDBC persistence layer for Iceberg metrics reporting (#3337)#3385

Merged
flyrain merged 59 commits intoapache:mainfrom
obelix74:feat-3337-rest-catalog-metrics-table-merged
Feb 27, 2026
Merged

feat(persistence): Add JDBC persistence layer for Iceberg metrics reporting (#3337)#3385
flyrain merged 59 commits intoapache:mainfrom
obelix74:feat-3337-rest-catalog-metrics-table-merged

Conversation

@obelix74
Copy link
Contributor

@obelix74 obelix74 commented Jan 8, 2026

Checklist

  • 🛡️ Don't disclose security issues! (contact [email protected])
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

Description

This PR introduces the JDBC persistence layer for storing Iceberg metrics reports (scan and commit operations) as part of the Compute Client Audit Reporting feature (#3337).

Overview

This change adds the foundational persistence infrastructure required to store Iceberg metrics data received via the REST catalog's /reportMetrics endpoint. The metrics data includes detailed information about scan operations (file counts, data volumes, filter expressions) and commit operations (added/removed files, snapshot details).

Components Introduced

Model Classes
• ModelScanMetricsReport - Immutable model representing scan metrics with 30+ fields
• ModelCommitMetricsReport - Immutable model representing commit metrics with 30+ fields

Converters
• MetricsReportConverter - Converts Iceberg ScanReport and CommitReport objects to persistence models
• ModelScanMetricsReportConverter - Maps JDBC ResultSet to ModelScanMetricsReport
• ModelCommitMetricsReportConverter - Maps JDBC ResultSet to ModelCommitMetricsReport

Persistence Methods (added to JdbcBasePersistenceImpl)
• writeScanMetricsReport() / writeCommitMetricsReport() - Insert metrics into the database
• queryScanMetricsReports() / queryCommitMetricsReports() - Query metrics by realm and catalog
• queryScanMetricsReportsByTraceId() / queryCommitMetricsReportsByTraceId() - Query metrics by OpenTelemetry trace ID
• deleteScanMetricsReportsOlderThan() / deleteCommitMetricsReportsOlderThan() - Time-based cleanup
• deleteAllMetricsReportsOlderThan() - Unified cleanup for both metric types

Testing

 • Unit tests for model classes (ModelScanMetricsReportTest, ModelCommitMetricsReportTest)
 • Integration tests for persistence operations (MetricsReportPersistenceTest)
 • Comprehensive test coverage for all CRUD operations, edge cases, and data integrity

Related Issues

 • Closes: #3337 (Compute Client Audit Reporting)

@obelix74 obelix74 requested a review from dimas-b January 8, 2026 19:35
@obelix74
Copy link
Contributor Author

obelix74 commented Jan 9, 2026

Retest this please.

@obelix74 obelix74 force-pushed the feat-3337-rest-catalog-metrics-table-merged branch from 461cfde to 2fc2a20 Compare January 9, 2026 22:13
@obelix74
Copy link
Contributor Author

@dimas-b As discussed, I have moved the AWS STS changes to its own PR. #3414

@obelix74 obelix74 force-pushed the feat-3337-rest-catalog-metrics-table-merged branch from 830085b to fcf2716 Compare January 11, 2026 02:40
@obelix74
Copy link
Contributor Author

@dimas-b thanks for the help in merging #3414.

I have rebased this PR against main - looks like the event code was simplified and rewritten. I have redone the event code to follow the changes.

I think this PR can be split into two.

PR 1: Enable metrics event emission
• EventAttributes.java - add REPORT_METRICS_REQUEST attribute
• IcebergRestCatalogEventServiceDelegator.java - emit BEFORE/AFTER events in reportMetrics()

PR 2: Add metrics persistence (depends on PR 1)
• PolarisPersistenceEventListener.java - handle AFTER_REPORT_METRICS, extract ScanReport/CommitReport data
• PolarisPersistenceEventListenerTest.java - new test file for persistence listener

If this helps, please let me know and I can split this PR into two.

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preliminary comments, not an end-to-end review :)

@dimas-b
Copy link
Contributor

dimas-b commented Jan 17, 2026

@obelix74 : Re: splitting into two PRs - if it's not too much overhead, it might be preferable from my POV. I think persistence impl. may be a bit more involved as I commented above.

@obelix74 obelix74 marked this pull request as draft January 17, 2026 02:17
@obelix74
Copy link
Contributor Author

@obelix74 : Re: splitting into two PRs - if it's not too much overhead, it might be preferable from my POV. I think persistence impl. may be a bit more involved as I commented above.

On it. Moving this PR to draft until the event PR is merged and this branch / PR is rebased against main.

@obelix74
Copy link
Contributor Author

@obelix74 : Re: splitting into two PRs - if it's not too much overhead, it might be preferable from my POV. I think persistence impl. may be a bit more involved as I commented above.

On it. Moving this PR to draft until the event PR is merged and this branch / PR is rebased against main.

@dimas-b Raised #3468

@obelix74 obelix74 force-pushed the feat-3337-rest-catalog-metrics-table-merged branch from 35bbf1e to 1954427 Compare January 17, 2026 03:33
@obelix74
Copy link
Contributor Author

@obelix74 : Re: splitting into two PRs - if it's not too much overhead, it might be preferable from my POV. I think persistence impl. may be a bit more involved as I commented above.

On it. Moving this PR to draft until the event PR is merged and this branch / PR is rebased against main.

@dimas-b Raised #3468

@dimas-b I have further redone this PR into the following distinct commits.

The feat/add-metrics-persistence-handler branch now contains 7 well-organized commits:

Commit Message Files Purpose
132ca9b feat(events): Enable metrics event emission in reportMetrics() 8 PR#1 - Feature flag + event types
1954427 feat(events): Add metrics persistence handler for AFTER_REPORT_METRICS events 2 PR#2 - Basic event listener
bf37017 feat(persistence): Add database schema and model classes for metrics (PR#3) 9 Schema v4 + model classes
c04e60e feat(persistence): Add JDBC persistence implementation for metrics (PR#4) 4 JDBC layer + tests
ed8b107 feat(reporting): Add metrics reporters infrastructure (PR#5) 11 Reporters + configuration
6def4a4 feat(cleanup): Add metrics cleanup service and documentation (PR#6) 2 Cleanup service + docs
553045d test(integration): Update InMemoryBufferEventListenerIntegrationTest 1 Integration tests

PR#1 is #3468.

If need be, we can separate out each one of these self-contained commits into their own PRs for review.

@obelix74 obelix74 changed the title (feat) Add metrics persistence with dual storage strategy for Iceberg table operations (fixes #3337) (feat) Add metrics persistence for Iceberg table operations (fixes #3337) Jan 21, 2026
@obelix74 obelix74 changed the title (feat) Add metrics persistence for Iceberg table operations (fixes #3337) feat(persistence): Add database schema and persistence layer for Iceberg metrics reporting (#3337) Jan 21, 2026
obelix74 pushed a commit to obelix74/polaris that referenced this pull request Jan 21, 2026
Address reviewer feedback from PR apache#3385 (comment r2700474938):
- Introduce MetricsPersistence interface in polaris-core for backend-agnostic metrics persistence
- Add MetricsContext immutable class to encapsulate metrics context information
- Add NoOpMetricsPersistence for backends that don't support metrics persistence
- Add JdbcMetricsPersistence implementation that wraps JdbcBasePersistenceImpl
- Add MetricsPersistenceProducer CDI producer to select implementation at runtime
- Refactor PersistenceMetricsProcessor to use MetricsPersistence instead of instanceof checks

This design supports the upcoming NoSQL persistence backend (apache#3396) by allowing
each backend to provide its own MetricsPersistence implementation or use the
no-op implementation that silently discards metrics.
@obelix74 obelix74 marked this pull request as ready for review January 23, 2026 16:58
@dimas-b dimas-b requested a review from singhpk234 January 23, 2026 16:59
@obelix74
Copy link
Contributor Author

@adutra @dimas-b @singhpk234

This PR for metrics persistence is now ready for review.

PR1 (this PR): Database schema v4 and persistence layer for metrics storage.
• New tables: scan_metrics_report, commit_metrics_report with indexes
• Model classes and converters for Iceberg ScanReport/CommitReport
• Persistence methods: write, query (by table/trace ID), delete (for retention)
• Schema version checks for graceful degradation on v1-v3 deployments (review feedback from @dimas-b)

PR2 (follow-up, not raised yet): Metrics processing framework that uses this persistence layer.
• MetricsPersistence interface with JdbcMetricsPersistence and NoOpMetricsPersistence implementations (review feedback from @dimas-b)
• MetricsContext for passing request context (principal, trace IDs)
• CDI producer for selecting persistence implementation based on configuration
• Integration with the REST catalog metrics endpoint

@obelix74 obelix74 requested a review from dimas-b January 23, 2026 17:13
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more preliminary comments... Trying to understand end-to-end use cases 🤔

@obelix74 obelix74 requested a review from dimas-b January 23, 2026 17:13
Metrics tables are now part of the main schema and use the primary
datasource instead of a separate named 'metrics' datasource.

Changes:
- Deleted schema-metrics-v1.sql files (postgres and h2)
- Added metrics tables to main schema-v4.sql files
- Deleted MetricsSchemaBootstrapUtil.java
- Removed openMetricsSchemaResource from DatabaseType
- Simplified JdbcBasePersistenceImpl (removed metricsDatasourceOperations)
- Removed @nAmed('metrics') datasource injection from JdbcMetaStoreManagerFactory
- Removed metrics datasource examples from application.properties
- Updated CHANGELOG.md
- Updated tests to work with single datasource
@obelix74 obelix74 requested a review from dimas-b February 25, 2026 21:22
Comment on lines 170 to 172
// Run the set-up script to create the tables.
// Run the set-up script to create the tables (includes metrics tables).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: do we need this comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed comment "(includes metrics tables)" from JdbcMetaStoreManagerFactory.java

@Inject PolarisDiagnostics diagnostics;
@Inject PolarisStorageIntegrationProvider storageIntegrationProvider;
@Inject Instance<DataSource> dataSource;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

* @param conflictColumns The columns that form the conflict/key constraint.
* @return INSERT query with value bindings that ignores conflicts.
*/
public static PreparedQuery generateIdempotentInsertQuery(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we change the idempotent query in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generateIdempotentInsertQuery method is needed for metrics persistence to handle potential duplicate metric reports gracefully. When a compute engine (like Spark) retries a request due to network issues, it may send the same metric report multiple times with the same report ID.

Using idempotent inserts (ON CONFLICT DO NOTHING for PostgreSQL, MERGE for H2) ensures:

  1. The first insert succeeds and persists the metric
  2. Subsequent duplicate inserts are silently ignored without errors
  3. No data corruption or duplicate entries in the metrics tables

I can remove it from this PR if you feel it is not necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current conflict key will never cause conflicts, as report id is uuid.

  private static final List<String> METRICS_CONFLICT_COLUMNS = List.of("realm_id", "report_id");

Further more, I don't think we can actually easily distinguish two duplicated scan report. I'd suggest to remove it before we can actually figure a way to distinguish duplicated report.


// BasePersistence doesn't implement MetricsPersistence
return MetricsPersistence.NOOP;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the place I really hope any contributor to read though the code to understand how metastore mananger work. PolarisMetaStoreManager itself should be able to used directly here to persist metrics instead of grabbing the instance of JdbcBasePersistenceImpl, which violates the principal of abstraction, this layer shouldn't know any persistence-specific classes, it is coded against an interface instead of a concrete class, otherwise, this part will break when any user switches to a different persistence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I did not understand that part of the code correctly. Followed the design of PolarisEventManager and created a PolarisMetricsManager. Modified PolarisMetastoreManager to implement PolarisMetricsManager. Then injected PolarisMetastoreManager and delegated persistence to that.

@obelix74
Copy link
Contributor Author

Re: question about generateIdempotentInsertQuery

The generateIdempotentInsertQuery method is needed for metrics persistence to handle potential duplicate metric reports gracefully. When a compute engine (like Spark) retries a request due to network issues, it may send the same metric report multiple times with the same report ID.

Using idempotent inserts (ON CONFLICT DO NOTHING for PostgreSQL, MERGE for H2) ensures:

  1. The first insert succeeds and persists the metric
  2. Subsequent duplicate inserts are silently ignored without errors
  3. No data corruption or duplicate entries in the metrics tables

This is different from regular entity persistence where we want conflicts to fail so we can detect concurrent modifications.

Anand Kumar Sankaran added 2 commits February 25, 2026 14:30
- Remove unnecessary comments and blank lines in JdbcMetaStoreManagerFactory
- Remove extra blank line in runtime/admin application.properties
- Update CHANGELOG.md to clarify configuration requirement
- Fix abstraction violation: use MetricsPersistence interface instead of
  JdbcBasePersistenceImpl in PersistingMetricsReporter
- Add setMetricsRequestContext() to MetricsPersistence interface with
  default no-op implementation
- Update tests to mock interface instead of concrete class
Address PR review comment #2855743843 by using PolarisMetaStoreManager
instead of directly accessing persistence, following the same pattern
as PolarisEventManager.

Changes:
- Add PolarisMetricsManager interface with default methods that delegate
  to MetricsPersistence when supported
- Extend PolarisMetaStoreManager with PolarisMetricsManager
- Update PersistingMetricsReporter to inject PolarisMetaStoreManager
  and use its writeScanMetrics/writeCommitMetrics methods
- Update tests to verify calls to PolarisMetaStoreManager
@obelix74 obelix74 requested a review from flyrain February 25, 2026 22:47
The NOOP constant and NoOpMetricsPersistence class are no longer used
after the refactoring to use PolarisMetricsManager, which silently
ignores metrics when the underlying persistence doesn't support them.
Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on it, @obelix74 ! Left some comments. I think we are one step closer.

effectiveSchemaVersion,
realm);
try {
// Run the set-up script to create the tables.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we may still keep this comment

this.callContext = callContext;
this.metaStoreManager = metaStoreManager;
this.polarisPrincipal = polarisPrincipal;
this.requestIdSupplier = requestIdSupplier;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do we inject the bean?

Copy link
Contributor Author

@obelix74 obelix74 Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PersistingMetricsReporter bean is injected via CDI in ServiceProducers.java (line 434-439):

@Produces
@ApplicationScoped
public PolarisMetricsReporter metricsReporter(
MetricsReportingConfiguration config, @Any Instance<PolarisMetricsReporter> reporters) {
    return reporters.select(Identifier.Literal.of(config.type())).get();
 }

This producer method:

  1. Reads polaris.iceberg-metrics.reporting.type config (defaults to "default" via MetricsReportingConfiguration)
  2. Selects the PolarisMetricsReporter implementation that has the matching @Identifier annotation
  3. When type=persisting, it selects PersistingMetricsReporter (which has @Identifier("persisting"))

The reporter is then used by IcebergCatalogHandler.reportMetrics() to handle incoming metrics reports.

To enable persisting metrics, users would set:

polaris.iceberg-metrics.reporting.type=persisting

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the line 78, where do we inject the requestIdSupplier bean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a producer in ServiceProducers, that seems to have been deleted by accident. I've added a @RequestScoped producer for RequestIdSupplier in ServiceProducers.java. It retrieves the request ID that RequestIdFilter stores on each incoming request via ContainerRequestContext.setProperty().

* @param callCtx the call context containing the persistence layer
* @return true if metrics can be persisted, false otherwise
*/
default boolean supportsMetricsPersistence(@Nonnull PolarisCallContext callCtx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not needed, can we remove it?

Anand Kumar Sankaran added 2 commits February 26, 2026 11:41
- Restore comment 'Run the set-up script to create the tables' in JdbcMetaStoreManagerFactory
- Remove unused supportsMetricsPersistence() method from PolarisMetricsManager
- Move SpiModelConverter methods to Model classes (ModelScanMetricsReport, ModelCommitMetricsReport)
- Delete SpiModelConverter.java - conversion logic now in Model classes following ModelEntity pattern
- Add RequestIdSupplier producer in ServiceProducers for request ID injection
- Update JdbcBasePersistenceImpl and tests to use new Model class methods
- Add principalName, requestId, otelTraceId, otelSpanId to MetricsRecordIdentity
- Remove setMetricsRequestContext() from MetricsPersistence SPI
- Simplify PolarisMetricsManager.write*() to take only (callCtx, record)
- Populate request context in PersistingMetricsReporter before calling manager
- Update MetricsRecordConverter builders with request context methods
- Simplify JdbcBasePersistenceImpl - no more member variables for context

Addresses reviewer feedback: single SPI method with complete record data
@obelix74 obelix74 requested a review from flyrain February 26, 2026 21:04
- BasePersistence now extends MetricsPersistence with default no-op methods
- Remove instanceof checks in PolarisMetricsManager
- Each implementation decides to override or use default no-op

Addresses review feedback: cleaner design without runtime type checking
* @param conflictColumns The columns that form the conflict/key constraint.
* @return INSERT query with value bindings that ignores conflicts.
*/
public static PreparedQuery generateIdempotentInsertQuery(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current conflict key will never cause conflicts, as report id is uuid.

  private static final List<String> METRICS_CONFLICT_COLUMNS = List.of("realm_id", "report_id");

Further more, I don't think we can actually easily distinguish two duplicated scan report. I'd suggest to remove it before we can actually figure a way to distinguish duplicated report.

@obelix74 obelix74 force-pushed the feat-3337-rest-catalog-metrics-table-merged branch from 00bb147 to 115921c Compare February 27, 2026 16:14
@obelix74 obelix74 force-pushed the feat-3337-rest-catalog-metrics-table-merged branch from 115921c to 7cfc20e Compare February 27, 2026 16:21
@obelix74 obelix74 requested a review from flyrain February 27, 2026 16:22
* .build();
* }</pre>
*/
public final class MetricsRecordConverter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to move this class to package persistence. metrics. Not a blocker though. We could refactor later.

// Test that bootstrap works without --include-metrics (default behavior)
public void testBootstrap(QuarkusMainLauncher launcher) {
// Test that bootstrap command works successfully.
// Metrics tables are now part of the base schema (v4) and bootstrapped automatically.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think line 51 isn't needed.


compileOnly("com.fasterxml.jackson.core:jackson-annotations")

compileOnly(project(":polaris-relational-jdbc"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this dependency?

…lational-jdbc

The admin module does not have any compile-time imports from
polaris-relational-jdbc. The runtimeOnly dependency is sufficient
since the module is resolved via CDI at runtime.
Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on it, @obelix74 !

@flyrain flyrain merged commit 483e131 into apache:main Feb 27, 2026
17 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Feb 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants