Skip to content

[Issue 3209] Surface Intermediate Data as part of Polaris Events#3456

Merged
HonahX merged 10 commits intoapache:mainfrom
adnanhemani:ahemani/solve_issue_3209
Jan 29, 2026
Merged

[Issue 3209] Surface Intermediate Data as part of Polaris Events#3456
HonahX merged 10 commits intoapache:mainfrom
adnanhemani:ahemani/solve_issue_3209

Conversation

@adnanhemani
Copy link
Contributor

Adds a RequestScoped annotation to AttributeMap, which allows us to inject this bean into intermediary business logic and be able to patch data from there back out to the PolarisEvent.

@dimas-b had a great idea that we may be able to retire the Delegator model eventually by setting a "closing" hook on the RequestScoped bean - but for simplicity, I am introducing this smaller change that will solve the GH issue linked first and will investigate that approach later.

Closes #3209.

Checklist

  • 🛡️ Don't disclose security issues! (contact [email protected])
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes Implement sending POJOs not within the Response to Events #3209
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • [N/A] 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

Copy link
Contributor

@evindj evindj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks good to me, I left some questions mostly for my own understanding of the commit and event flow.

@dimas-b dimas-b requested a review from adutra January 16, 2026 19:33
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @adnanhemani ! The general direction of this PR LGTM... some comments below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we change all these new AttributeMap() occurrences to use the injected attribute map instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a different thought - but would like to hear your thoughts on this as well.

I was using AttributeMap to inject into the deeper parts of the codebase - but then using that to populate the final AttributeMap that gets passed to the Event Listener. The chief concern was to ensure that the After event attributes don't accidentally become strictly a super-set of the Before events attributes - and that cleaner attribute maps going into the event listener may help with filtering concerns that we will need to tackle in order to enable all events going to their eventual sink.

WDYT?

Copy link
Contributor

@dimas-b dimas-b Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EventAttributeMap is a proper CDI bean now. It should be ok to rely on CDI to create and pass around (inject) one instance of it per request.

If we want to distinguish before/after data, we could do that by code / methods inside EventAttributeMap. As I commented in another thread, I see value in evolving this class to become a rich and typed container of event-related data for each request (more than a simple key-value map).

Copy link
Contributor Author

@adnanhemani adnanhemani Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dimas-b, if you're okay with it, let's handle this in a separate PR then? I think we may need to iterate on this idea a few times before we get it right - but this PR is attempting to do something very related but slightly orthogonal and IMO enough for a first take at this functionality.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not mind incremental work here, but please sync with @adutra in terms of timing.

Copy link
Contributor

@adutra adutra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is almost good to go imho, we just need to settle on the LOAD_TABLE_RESPONSES attribute.

@adnanhemani adnanhemani force-pushed the ahemani/solve_issue_3209 branch from 478e8df to 344a434 Compare January 23, 2026 22:46
Comment on lines 114 to 115
public static final AttributeKey<List<TableMetadata>> TABLE_METADATA =
new AttributeKey<>("table_metadatas", new TypeToken<List<TableMetadata>>() {});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field name should reflect the attribute name: TABLE_METADATAS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be an inconsistency here: sometimes this attribute seems to be legitimately collecting a list of metadatas, cf. IcebergCatalogHandler. But sometimes, it's collecting a singleton list of a unique table metadata, cf. IcebergCatalogEventServiceDelegator.

I think it would be better to distinguish two attributes: TABLE_METADATA and TABLE_METADATAS. (I understand that the latter may become a sort of "internal" attribute that is never exposed to listeners – we could maybe declare those separately if we are concerned about that.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'm glad to do this. I was trying to make everything fit within TABLE_METADATAS due to the similar comment that Dmitri had pointed out above. But I will make a comment for this for now so we can move past and revisit if there is a better way to do this later.

adutra
adutra previously approved these changes Jan 27, 2026
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jan 27, 2026
dimas-b
dimas-b previously approved these changes Jan 27, 2026
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @adnanhemani ! The PR LGTM 👍

@adnanhemani adnanhemani dismissed stale reviews from dimas-b and adutra via 256ba24 January 29, 2026 05:25
@adnanhemani
Copy link
Contributor Author

@adutra @dimas-b - I had to fix the merge conflicts :( Can you both please re-approve and merge?

Copy link
Contributor

@HonahX HonahX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Thanks @adnanhemani for the work and @evindj @dimas-b @adutra for reviewing! Looks like all the comments are either resolved or planned to resolve incrementally. Merging.

@HonahX HonahX merged commit baf01a5 into apache:main Jan 29, 2026
15 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Jan 29, 2026
@adnanhemani adnanhemani deleted the ahemani/solve_issue_3209 branch January 29, 2026 21:46
sungwy pushed a commit to sungwy/polaris that referenced this pull request Feb 7, 2026
…che#3456)

Adds a RequestScoped annotation to AttributeMap, which allows us to inject this bean into intermediary business logic and be able to patch data from there back out to the PolarisEvent.

@dimas-b had a great idea that we may be able to retire the Delegator model eventually by setting a "closing" hook on the RequestScoped bean - but for simplicity, I am introducing this smaller change that will solve the GH issue linked first and will investigate that approach later.
snazy added a commit to snazy/polaris that referenced this pull request Feb 11, 2026
* Nit: fix shell-check warnings for await-s3.sh (apache#3593)

* Nit: add upgrade notes for develocity plugin (apache#3599)

* fix(deps): update dependency org.apache.hive:hive-exec to v2.3.10 (apache#3580)

* fix(deps): update dependency com.google.cloud:google-cloud-storage-bom to v2.62.1 (apache#3600)

* Change quick start example from MinIO to RustFS backend (apache#3586)

* chore(deps): update gradle/actions digest to f29f5a9 (apache#3602)

* Simplify getting started example for Ceph (apache#3587)

* Simplify getting started example for Ceph

* Simplify getting started example for Ceph

* fix(deps): update dependency io.smallrye.config:smallrye-config-core to v3.16.0 (apache#3604)

* chore(deps): update docker.io/mongo docker tag to v8.2.4 (apache#3606)

* fix(deps): update dependency net.ltgt.gradle:gradle-errorprone-plugin to v5 (apache#3605)

* chore(deps): update plugin com.gradle.develocity to v4.3.2 (apache#3598)

* chore(deps): update localstack/localstack docker tag to v4.13 (apache#3609)

* Guides: use `sql` code block (apache#3590)

Replaces the "anonymous" code blocks with `sql` code blocks, which is easier to "copy + paste" for users.

* Spark guide: document the launch command in the Markdown (apache#3592)

Replaces the "invisible" launch-docker.sh script with a "visible" `shell` code block, so the commands are immediately visible to users.

* Minio guide: fix service dependencies (apache#3594)

Use the "long" options for compose service dependencies, and adds the explicit option `restart: no` option for setup tasks, which is necessary to let docker-compose correctly interpret the termination of such tasks.

* Ozone guide: Fix service dependencies (apache#3595)

Use the "long" options for compose service dependencies, add healthchecks, uses more healthcheck-retries for slow-ish machines, and adds the explicit option `restart: no` option for setup tasks, which is necessary to let docker-compose correctly interpret the termination of such tasks.

* Rustfs guide: fix service dependencies (apache#3596)

Just adds a necessary service dependency

* Telemetry guide: fix setup-polaris service (apache#3597)

Fixes the polaris-setup service to use the "working" way/syntax, in alignment to other guides.

* Site-workflow: add `workflow_call` trigger (apache#3520)

This trigger allows the Hugo Site publication workflow to be called from the `versioned-docs` branch.

This change also introduces a concurrency group to prevent concurrent site publications.

Contributed to apache#3516

Co-authored-by: Dmitri Bourlatchkov <[email protected]>

* [Issue 3209] Surface Intermediate Data as part of Polaris Events (apache#3456)

Adds a RequestScoped annotation to AttributeMap, which allows us to inject this bean into intermediary business logic and be able to patch data from there back out to the PolarisEvent.

@dimas-b had a great idea that we may be able to retire the Delegator model eventually by setting a "closing" hook on the RequestScoped bean - but for simplicity, I am introducing this smaller change that will solve the GH issue linked first and will investigate that approach later.

* Getting-Started: Jaeger upgraded from V1 to V2 (apache#3603)

* Jaeger V2

* Jaeger V2

* Update Gradle to 9.3.1 (apache#3615)

* Last merged commit 90ba09c

---------

Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: Yong Zheng <[email protected]>
Co-authored-by: Dmitri Bourlatchkov <[email protected]>
Co-authored-by: Adnan Hemani <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement sending POJOs not within the Response to Events

5 participants