Skip to content

Added interface for reporting metrics#2887

Merged
adutra merged 23 commits intoapache:mainfrom
cccs-cat001:main
Nov 5, 2025
Merged

Added interface for reporting metrics#2887
adutra merged 23 commits intoapache:mainfrom
cccs-cat001:main

Conversation

@cccs-cat001
Copy link
Contributor

What changes were proposed in this pull request?

I'm adding an interface to be able to code for the /metrics endpoint in the Iceberg REST API Spec.

Why are the changes needed?

The metrics endpoint in Polaris is currently a no-op endpoint. In our organization we use it for auditing who is reading/writing tables, as this endpoint is called after every read/write with Iceberg. I added a default no-op implementation so that the behaviour remains the same, and I added a logging implementation to show how it could be used.

Does this PR introduce any user-facing change?

it adds an optional configuration for telling the metrics endpoint how it should behave. the default behaviour is no change, however if the user sets the polaris.metrics.reporting.type to logging it will log the MetricsReport to INFO. There is also the ability to implement the interface for your own needs if necessary.

How was this patch tested?

Tests were added. it was also tested in a staging environment.

CHANGELOG.md

Copy link
Contributor

@adutra adutra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this contribution @cccs-cat001! I left a few comments mostly around API design.

@Override
public void reportMetric(
String prefix, String namespace, String table, ReportMetricsRequest reportMetricsRequest) {
// Do Nothing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Would it make sense to define and maintain a set of Micrometer metrics and increment them here?
  2. Would it make sense to fire a PolarisEvent here?

(I don't really know what an IRC server is supposed to do with these scan or commit metrics 😅)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what those are 😅 my intentions were to have the default behaviour remain the same, and adding that may change the default behaviour

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄 - no worries, we can look into that later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My recommendation was to store them in persistence, so that later we can serve them later, I recommend the same to @pingtimeout https://lists.apache.org/thread/koogjvb73hbl9ospmm7owsdx8oh67wk1

Copy link
Contributor

@adutra adutra Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we would need to purge that table frequently. Not sure if it's a good default to store in persistence without providing the purge functionality. Could that be an alternate CDI bean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, though It can be any persistence, not necessarily an OLTP imho an OLAP is a valid consideration too, so is dumping in object storge so analytical engines can make a table on top of from Polaris POV, it seems to me that we are settling on catalog name and tble identifiers in addition to report from interface POV, why is realm-id not part of it is my doubt, why is authenticate prinicipals not part of these metric ?

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your interest in Polaris and contributing, @cccs-cat001 !

I'm not sure how this PR relates to the /metrics endpoint in the Iceberg REST API, though 🤔 Could you clarify, please?

@RequestScoped
public MetricsReporter metricsReporter(
MetricsReportingConfiguration config, @Any Instance<MetricsReporter> reporters) {
return reporters.select(Identifier.Literal.of(config.type())).get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config.type() is fixed in runtime... Can this bean be @ApplicationScoped?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This producer is annotated @ApplicationScoped but it returns a PolarisMetricsReporter, which is @RequestScoped. That’s problematic in CDI because an @ApplicationScoped bean lives for the entire application lifecycle, while a @RequestScoped bean only lives for a single request.
If the reporter (or a future implementation) ever injects request-scoped beans like RealmContext or SecurityContext, this setup would either fail at deployment or produce invalid references once the request ends.

Instead of producing a reporter directly in an app-scoped producer, introduce a small factory:

@ApplicationScoped
public class PolarisMetricsReporterFactory {

  public PolarisMetricsReporter getOrCreate(String type) {

  }
}

Similar to MetaStoreManagerFactory, which ensure the scope correct, while avoid creating a new object for every request. It is also more flexible when we introduce more reporter types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A factory is ApplicationScoped(or Singleton), but a bean can be ApplicationScoped or RequestScoped:

  • Should the bean exist only during the lifespan of a request? If so, use @RequestScoped.
  • Should the bean exist during the lifespan of the application and should the state of the bean be shared between all the requests? If so, use @ApplicationScoped.

So, to use ApplicationScoped here, a factory is probably clearer.

Copy link
Contributor

@adutra adutra Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm -1 on the factory approach, The factory approach in Polaris is generally used for creating and caching realm-scoped beans. The producer method approach OTOH is used for selecting beans at runtime according to their @Identifier.

Here, it's clearly an application-scoped bean, selectable at runtime with its @Identifier.

The correct fix for me is just to replace @RequestScoped with @ApplicationScoped in the producer method:

@ApplicationScoped
public MetricsReporter metricsReporter(...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The correct fix for me is just to replace @RequestScoped with @ApplicationScoped in the producer method:

+1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cccs-cat001 please change the scope of the DefaultMetricsReporter bean to also @ApplicationScoped – there is no reason for it to be request-scoped.

Once this change is done we should be good 🤞

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, given this producer method, I do not think DefaultMetricsReporter needs a scope annotation at all. It is always created via this method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I see, I think you're right and we don't need the annotation. Other examples with @ApplicationScope in the ServiceProducer follow the same pattern.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, given this producer method, I do not think DefaultMetricsReporter needs a scope annotation at all. It is always created via this method.

Yes and no :-) It doesn't need the annotation but 1) all other beans in the same situation are annotated and 2) annotating the bean allows for direct injection, e.g. in a test:

@Inject @Identifier("default") MetricsReporter defaultMetricsReporter;

So, I'm rather in favor of anotating both places, even if I understand that this can cause some confusion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotating both places SGTM 👍 Using @ApplicationScoped, right?

@adutra
Copy link
Contributor

adutra commented Oct 24, 2025

I'm not sure how this PR relates to the /metrics endpoint in the Iceberg REST API, though 🤔 Could you clarify, please?

If that can help: I was also puzzled initially, but we are actually talking about this IRC endpoint:

/v1/{prefix}/namespaces/{namespace}/tables/{table}/metrics:

Copy link
Contributor

@adutra adutra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there! Just a few minor issues.

cccs-cat001 and others added 5 commits October 28, 2025 12:32
@cccs-cat001 cccs-cat001 requested a review from adutra October 28, 2025 15:59
@cccs-cat001
Copy link
Contributor Author

I'm staring at the logs of the two failed checks and I'm not really sure what the cause of the errors are. Do you have any advice? 😅

@adutra
Copy link
Contributor

adutra commented Oct 29, 2025

I'm staring at the logs of the two failed checks and I'm not really sure what the cause of the errors are. Do you have any advice? 😅

You have 4 files without license header:

runtime/service/src/test/java/org/apache/polaris/service/reporting/DefaultMetricsReporterTest.java
runtime/service/src/main/java/org/apache/polaris/service/reporting/DefaultMetricsReporter.java
runtime/service/src/main/java/org/apache/polaris/service/reporting/MetricsReportingConfiguration.java
runtime/service/src/main/java/org/apache/polaris/service/reporting/PolarisMetricsReporter.java

The other error occurred while pulling a Docker image. It's a transient error.

@adutra
Copy link
Contributor

adutra commented Oct 30, 2025

@cccs-cat001 One last thing and we are good to go: do you mind adding this blurb to the CHANGELOG.md file?

### Highlights

- Support for [Iceberg Metrics Reporting] has been introduced in Polaris. Out of the box, metrics can
  be printed to the logs by setting the `org.apache.polaris.service.reporting` logger level to `INFO` (it's
  set to `OFF` by default). Custom reporters can be implemented and configured to send metrics to
  external systems for further analysis and monitoring.

[Iceberg Metrics Reporting]: https://iceberg.apache.org/docs/latest/metrics-reporting/

@cccs-cat001 cccs-cat001 requested review from adutra and dimas-b October 30, 2025 16:44
dimas-b
dimas-b previously approved these changes Oct 30, 2025
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bearing with me @cccs-cat001 🙂

@dimas-b
Copy link
Contributor

dimas-b commented Oct 30, 2025

As @adutra noted, let's wait at least until tomorrow for other people to have a chance at reviewing.

Side note: we may want to add Authorization checks for metric reporting (in a follow-up PR, of course).

@dimas-b
Copy link
Contributor

dimas-b commented Nov 3, 2025

Are we ok to merge? I see a lot of recent comments resolved, but no fresh approvals 😅

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@RequestScoped
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this scope annotation if this is meant to be application scoped? More details are here, https://github.com/apache/polaris/pull/2887/files#r2487347077

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! Good catch!

adutra
adutra previously approved these changes Nov 4, 2025
dimas-b
dimas-b previously approved these changes Nov 4, 2025
@adutra
Copy link
Contributor

adutra commented Nov 4, 2025

The current CI failure actually demonstrates that we must annotate the CDI bean with something – otherwise, it's not a CDI bean and so it's not included in the alternatives passed to the producer method.

TLDR: DefaultMetricsReport must be annotated with @ApplicationScoped.

cc @cccs-cat001 @dimas-b

@dimas-b
Copy link
Contributor

dimas-b commented Nov 4, 2025

TIL: re: bean annotations 😅 Like I commented on the specific thread (above) I'm ok with annotating both the bean and the producer with @ApplicationScoped

@cccs-cat001 cccs-cat001 dismissed stale reviews from dimas-b and adutra via dcacc5c November 4, 2025 21:17
@cccs-cat001
Copy link
Contributor Author

Okay final answer (hopefully!!!)

Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks a lot for adding this, @cccs-cat001 !

import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.metrics.MetricsReport;

public interface PolarisMetricsReporter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker: would the name IcebergMetricsReporter be more descriptive as this is only for iceberg tables, other tables managed by Polaris won't be able to report the metrics?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. But let's merge this PR (the author has been kind enough to address our collective feedback for two weeks now) and address this one in a follow-up task.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there was another TODO mentioned here #2887 (comment)

If my time allows I'd be happy to help contribute that, and if we can get the requirements defined beforehand :)

@adutra adutra merged commit 361b7e9 into apache:main Nov 5, 2025
30 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Nov 5, 2025
@adutra
Copy link
Contributor

adutra commented Nov 5, 2025

Thanks you so much @cccs-cat001 for your contribution – and your patience 😅

XN137 pushed a commit to XN137/polaris that referenced this pull request Nov 6, 2025
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* fix typo in Ozone getting-started guide (apache#2975)

* NoSQL: Persistence API (apache#2965)

Provides the persistence API parts for the NoSQL persistence work. Most of this PR are purely interfaces and annotations.

It consists of a low-level `Persistence` interface to read and write individual `Obj`ects and the corresponding pluggable `ObjType`s.

The API module also contains upstream SPIs for database specific implementations and downstream APIs for atomic commits and indexes.
Also some CDI specific infrastructure helper annotations.

Unit tests cover the few pieces of actual executable code in this change.

This change also adds a README, which references functionality and modules that are not in this PR, but already provide an overview of the overall interactions.

* Handle poetry lock update (apache#2942)

* Update dependency com.azure:azure-sdk-bom to v1.3.2 (apache#2979)

* Added interface for reporting metrics (apache#2887)

Co-authored-by: Alexandre Dutra <[email protected]>

* Prefer RealmConfig fields (apache#2971)

minor cleanup of verbose code

* Update community meetings note links due to document split (apache#2981)

* NoSQL: database agnostic implementation + in-memory backend (apache#2977)

This change contains the database agnostic implementation plus the in-memory backend used for testing purposes, and a Junit extension.

These three modules are difficult to put into isolated PRs.

The "main implementation" contains the commit-logic, indexes-logic and the caching part.
`PersistenceImplementation` is (more or less) a wrapper providing higher-level functionality backed by a database's `Backend` implementation. The latter provides the bare minimum functionality.
Other implementations of the `Persistence` interface are just to transparently add caching and commit-attempt specific case.
No call site needs to bother about the actual implementation and/or its layers.

Tests in the `polaris-persistence-nosql-impl` module use the in-memory backend via the Junit extension.
Common tests for all backends, in-memory in this PR and MongoDB in a follow-up, are in the testFixtures of the `polaris-persistence-nosql-impl`.

* Fix incorrect column name in ModelEvent (apache#2987)

Fixes apache#2913

* Rename request ID header (apache#2988)

* Update dependency org.agrona:agrona to v2.3.1 (apache#2986)

* Update actions/stale digest to fad0de8 (apache#2984)

* Last merged commit 291bd7d

---------

Co-authored-by: Dmitri Bourlatchkov <[email protected]>
Co-authored-by: Yong Zheng <[email protected]>
Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: cccs-cat001 <[email protected]>
Co-authored-by: Alexandre Dutra <[email protected]>
Co-authored-by: Christopher Lambert <[email protected]>
Co-authored-by: JB Onofré <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants