Skip to content

NoSQL: Add maintenance service to Polaris admin tool#3395

Merged
snazy merged 9 commits intoapache:mainfrom
snazy:nosql-maintenance-admin
Feb 24, 2026
Merged

NoSQL: Add maintenance service to Polaris admin tool#3395
snazy merged 9 commits intoapache:mainfrom
snazy:nosql-maintenance-admin

Conversation

@snazy
Copy link
Member

@snazy snazy commented Jan 9, 2026

This change wires the maintenance service up to the Polaris admin tool.

The new commands are grouped under the nosql parent command. This should provide enough isolation for end users who do not use NoSQL Persistence.

The initial NoSQL maintenance replies on CEL expressions for configuration. This will be reworked in follow-up PRs under #3847.

The Admin Tool remains a collection of utilities for low-level manipulation of the Polaris database, such as bootstrapping and maintenance. It targets admin users, so ordinary API / Catalog users are not expected to be affected by introducing NoSQL commands into the Admin CLI. Administrators are expected to be familiar with the multitude of Persistence backends and should not be confused by the new NoSQL-specific commands.

Another area for follow-up (related to modularizing the Admin CLI for better dependency management) is covered by #3855.

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 just a few minor comments 🙂

NoSqlMaintenanceRunCommand.class,
},
mixinStandardHelpOptions = true,
description = "Polaris persistence maintenance.")
Copy link
Contributor

@dimas-b dimas-b Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we flag it as "beta" as discussed in the last community sync call?.. or perhaps "preview" since the actual persistent storage that can be maintained via this command has not been hooked up yet (IIRC).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we also probably shouldn't call it persistence maintenance unless we plan on having maintenance for other backends?

@CommandLine.Command(
name = "run",
mixinStandardHelpOptions = true,
description = {"Run Polaris persistence maintenance."})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe Run Polaris NoSQL persistence maintenance for the sake of clarity?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snazy : WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and similar description changes in the other new commands?

# - MongoDb - configure the via the Quarkus extension
polaris.persistence.nosql.backend=InMemory

## MongoDB version store specific configuration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MongoDB version store specific configuration defaults? I suppose the user can change any of these settings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments request here as on the service pr, These properties should probably be commented out by default since they aren't used

quarkus.mongodb.database=polaris
quarkus.mongodb.metrics.enabled=true
#quarkus.mongodb.connection-string=mongodb://localhost:27017
quarkus.mongodb.devservices.enabled=false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a runtime setting or build-time (the latter have a dedicated section)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per https://quarkus.io/guides/mongodb-dev-services quarkus.mongodb.devservices.enabled is a build-time config. Please move to the Build-time configuration section (lines 20-38) for the sake of consistency.


quarkus.arc.ignored-split-packages=\
org.apache.polaris.admintool.config,\
org.apache.polaris.admintool.maintenance,\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we refactor this code to avoid split packages upfront?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this package no longer exists.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Package org.apache.polaris.admintool.nosql.maintenance only exists in one module, that is in runtime/admin... Is this config required anymore? 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I see different split package warnings in my build:

2026-02-03T00:51:08.714249614Z build-15 WARN Detected a split package usage which is considered a bad practice and should be avoided. Following packages were detected in multiple archives: 
- "org.apache.polaris.service.catalog.api" found in [org.apache.polaris:polaris-api-catalog-service:1.4.0-incubating-SNAPSHOT, org.apache.polaris:polaris-api-iceberg-service:1.4.0-incubating-SNAPSHOT]
- "org.apache.polaris.service.catalog.api.impl" found in [org.apache.polaris:polaris-api-catalog-service:1.4.0-incubating-SNAPSHOT, org.apache.polaris:polaris-api-iceberg-service:1.4.0-incubating-SNAPSHOT]
- "org.apache.polaris.service.types" found in [org.apache.polaris:polaris-api-catalog-service:1.4.0-incubating-SNAPSHOT, org.apache.polaris:polaris-api-iceberg-service:1.4.0-incubating-SNAPSHOT]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a commit to remove this prop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll deal with the other warnings on main later.

Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think the NoSQL maintenance tool belongs in the runtime admin module. NoSQL is just one persistence impl., and this maintenance logic is only applicable to that specific backend. Other persistence implementations do not need it, yet this change pulls NoSQL specific behavior into a shared admin layer. This creates unnecessary coupling, complicates the module boundaries, and makes the admin tooling harder to reason about and extend going forward.

cc @dennishuo @collado-mike @singhpk234 @jbonofre @HonahX @RussellSpitzer

@flyrain flyrain requested a review from dennishuo January 10, 2026 20:47
@snazy
Copy link
Member Author

snazy commented Jan 12, 2026

The admin-tool is meant for all persistence implementations, because it is a tool to perform administrative tasks. We have discussed before that other persistence specific operations like schema setup for JDBC would go into the admin-tool.
Adding another tool is rather going to complicate user experience.

@dimas-b
Copy link
Contributor

dimas-b commented Jan 12, 2026

@flyrain :

I do not think the NoSQL maintenance tool belongs in the runtime admin module. [...]

NoSQL persistence (by design) needs periodic maintenance by admin users. Not providing a tool for this will mean that no Apache Polaris user will be able to use NoSQL persistence efficiently.

As to the location of the tool, Admin CLI is a natural place as it hosts code for bootstrap and other administrative / setup tasks.

I believe this is the minimum the project community can do for OSS users.

Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NoSQL maintenance service introduces constructs which are not agreed upon in Apache Polaris, neither have ever been discussed and agreed upon the community. for example retention expression, dependency on an unmaintained project nessie library here mentioned here #3268 (review), other than question to where this should be in Polaris tools or Polaris repo has not been agreed upon, despite being called out.

It has been requested to start dev thread discussion to establish concencus for the same, despite that underlying PR was merged
#3268 (comment)

To avoid any confusion and explicitly mark my objection i am requesting changes, kindly please start thread establish concencus upon the semantics that Polaris should support .

@dimas-b
Copy link
Contributor

dimas-b commented Jan 14, 2026

out.println("use the 'help maintenance' command.");
out.println();

checkInMemory();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be failing for everything that isn't NoSql not just InMemory?

checkInMemory();

out.println();
out.println("Information: selected NoSql persistence backend: " + backend.type());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current formulation this isn't correct since you can call this with any backend right?

@Inject protected MaintenanceConfig maintenanceConfig;

protected void checkInMemory() {
if ("InMemory".equals(backend.type())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a check here for the persistence not being NoSQL?

@RussellSpitzer
Copy link
Member

As discussed on the mailing list, I think this is fine to add into the Admin tool but the current PR needs a bit of work to make it clear to end users that it is just for NoSQL.

import picocli.CommandLine;

@CommandLine.Command(
name = "nosql",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

RussellSpitzer
RussellSpitzer previously approved these changes Jan 22, 2026
Copy link
Member

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1, I notice there are still some comments from @dimas-b though that are still not addressed?

@singhpk234 singhpk234 dismissed their stale review January 23, 2026 01:35

I can agree to disagree here, I don't need to agree here either (dismissing my request changes), as we definitely want to, imho this tool doesn't belong in admin module, specially strictly speaking from the MetaStorageManager interface POV bootstrap and purge are well defined in this manage interface. and admin tool documentation its for metastore management.

NoSqlMaintenanceRunCommand.class,
},
mixinStandardHelpOptions = true,
description = "Polaris NoSQL persistence.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: suggestion: Sub-commands specific to NoSQL persistence.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to this rewording

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the liberty to push this update from my side. @snazy : I hope you do not mind.

import picocli.CommandLine;

@CommandLine.Command(
name = "maintenance-info",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Maybe add maintenance as another sub-command group?

Current: java -jar ... nosql maintenance-info
Proposed: java -jar ... nosql maintenance info

I hope the latter in nicer in terms of UX.

quarkus.mongodb.database=polaris
quarkus.mongodb.metrics.enabled=true
#quarkus.mongodb.connection-string=mongodb://localhost:27017
quarkus.mongodb.devservices.enabled=false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per https://quarkus.io/guides/mongodb-dev-services quarkus.mongodb.devservices.enabled is a build-time config. Please move to the Build-time configuration section (lines 20-38) for the sake of consistency.

Comment on lines 77 to 78
quarkus.index-dependency.agrona.group-id=org.agrona
quarkus.index-dependency.agrona.artifact-id=agrona
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried building locally without these settings (with the workaround to get build warnings), but agrona was not flagged as needing an index... Is this setting required for the admin tool?

Copy link
Contributor

@dimas-b dimas-b Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-ran my local build after the recent rebase without this property and I did not see any build warnings about Agrona being unindexed.

@snazy : WDYT about removing this property?

@dimas-b
Copy link
Contributor

dimas-b commented Jan 31, 2026

Summarizing the dev discussion.

  • It looks like there is general consensus on adding NoSQL maintenance commands to the Admin Tool, scoped under a nosql top-level command group.

  • CEL appears to be regarded as excessive in terms of the flexibility and code maintenance it brings for the use cases it enables. I propose merging this PR as is (since it does not touch CEL directly) and converting NoSQL maintenance rules from CEL to java code + config in a follow up PR. I'll open a GH issue for this if other people of ok with this approach.

  • Further refactorings / renames are certainly possible and I invite ideas / proposals to be expressed on the dev thread (link above) or as GH issues.

@dimas-b
Copy link
Contributor

dimas-b commented Jan 31, 2026

@flyrain : Would you please reconsider your -1 on this PR given the above discussion?

@dimas-b
Copy link
Contributor

dimas-b commented Jan 31, 2026

@RussellSpitzer : just FYI: I re-requested your review my accident 😅 but fresh comments are welcome, of course 🙂

@dimas-b dimas-b added this to the 1.4.0 milestone Jan 31, 2026
@dimas-b
Copy link
Contributor

dimas-b commented Feb 2, 2026

@snazy : could you rebase this PR? I suspect some of the config concerns are related to the other NoSQL modules merged to main but not preset under this PR's git history 🤔

dimas-b
dimas-b previously approved these changes Feb 19, 2026
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this PR is good to merge based on the discussion in this PR and related dev email threads.

If anyone still sees disadvantages in merging it now, please send a quick recap of what you think those disadvantages might be (it is hard to track all discussion threads and I might have missed something).

Copy link
Contributor

@dennishuo dennishuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Belated) Thanks @dimas-b for the summary of the dev discussion in #3395 (comment) !

Based on that summary, I'm happy with the core content of the code in this PR to merge-and-iterate. To facilitate that, just a few things would be nice before merging to keep the most useful information intact for posterity:

  1. Can we update the PR description with the main takeaways from the dev discussion and also link to the dev discussion for posterity so that future folks who track down this PR when figuring things out can easily see the main reference materials without sifting through the really long PR comment threads?
  2. Can we add user-facing documentation along the same vein as https://github.com/apache/polaris/pull/3818/changes to actually see at a glance what the admin tool invocation will look like with the full nested subcommand and what it looks like to pass in arguments regarding retention time, etc? Normally I'd also expect the sample commands in the PR description of the PR that adds the new commands.
  3. If the example commands include usage of CEL, to make a note in that documentation that the "overly" expressive CEL syntax might be re-evaluated in the near future so the user shouldn't depend on it if possible (or else they should also join the dev discussion about the CEL functionality).
  4. Add a link to the aforementioned GH issue referred to in #3395 (comment) into the PR description or doc comments that explain the uncertain fate of CEL for easy navigation

Some of this request for docs follows from my comment in #3268 (review) - I understand the response there was to add docs "later", but since this PR is the one that actually establishes the user interaction model in terms of admin-tool commands, it's nice to pair the addition of the command with instructions for how it works.

NoSqlMaintenanceRunCommand.class,
},
mixinStandardHelpOptions = true,
description = "Polaris NoSQL persistence.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to this rewording

},
mixinStandardHelpOptions = true,
description = "Polaris NoSQL persistence.")
public class NoSqlCommand extends BaseNoSqlCommand {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question for my own understanding, since I'm not too familiar with how the command hierarchy works - how does this "sibling" command end up causing the nesting discussed on the dev list? It looks like the hierarchy is:

NoSqlMaintenanceRunCommand extends BaseNoSqlMaintenanceCommand
BaseNoSqlMaintenanceCommand extends BaseNoSqlCommand
BaseNoSqlCommand extends BaseCommand

While

NoSqlCommand extends BaseNoSqlCommand

Thus appears to be a "sibling" of BaseNoSqlMaintenanceCommand rather than being a "parent".

Some documentation on the expected full example commands would also be helpful to reason about this; could we add some to the PR description and/or readme/md docs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading through the code to answer my own question :)

I guess the subcommands tuple basically "intercepts" the other command classes rather than requiring those other command classes to know anything about being nested under a parent command?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, command nesting is derived from annotations. Implementation classes are free to extend other classes just to share code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dimas-b is right.
Java type hierarchy is orthogonal to picocli commands / annotations.

@dimas-b
Copy link
Contributor

dimas-b commented Feb 20, 2026

@dennishuo : Here's the issue for CEL follow-up: #3847

Re: Admin CLI docs, would you mind if they were added after this PR? The latest docs pages are still useful even if they do not make it to the same release as the binaries.

@dimas-b
Copy link
Contributor

dimas-b commented Feb 20, 2026

@dennishuo : I updated the PR discription... Is it better? :)

@snazy
Copy link
Member Author

snazy commented Feb 20, 2026

Added some docs for the web site.
The reference there to the "Configuration Reference" should be updated either in #3838 or here (whichever gets merged first.)

Alternatively, you can run maintenance at high or the default unlimited scan rate during off-peak hours.

{{< alert note >}}
The `nosql maintenance-run` command may emit a warning that not all purgable database entried have been deleted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: entried -> entries

dimas-b
dimas-b previously approved these changes Feb 20, 2026
adutra
adutra previously approved these changes Feb 20, 2026
@dimas-b
Copy link
Contributor

dimas-b commented Feb 20, 2026

@dimas-b dimas-b dismissed stale reviews from adutra and themself via b37049b February 20, 2026 20:49
Copy link
Contributor

@dennishuo dennishuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the doc updates! LGTM

@dimas-b
Copy link
Contributor

dimas-b commented Feb 23, 2026

@flyrain : I opened #3867 to allow users flexibility over dependencies in the Admin CLI. I intend to work on that follow-up issue immediately after merging this PR.

Do you think we can go ahead with this plan? Thx!

@flyrain flyrain dismissed their stale review February 23, 2026 19:21

#3867 is filed to fix forward.

@snazy snazy merged commit b2978a4 into apache:main Feb 24, 2026
32 of 34 checks passed
@github-project-automation github-project-automation bot moved this from PRs In Progress to Done in Basic Kanban Board Feb 24, 2026
@snazy snazy deleted the nosql-maintenance-admin branch February 24, 2026 07:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants