[Fix][paimon-e2e] e2e test error #9467
Conversation
Pull Request Overview
This PR fixes an e2e test error by updating configuration file syntax for paimon connectors and adjusting the test class for proper handling of multiple engine types.
- Updated the row_rules configuration in two files by removing extra array wrappers.
- Modified the PaimonIT test class to disable the Spark and Flink engines and add startup initialization (a rough sketch of the resulting class shape follows).
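For orientation, here is a minimal sketch of that class shape, assuming the SeaTunnel e2e helpers referenced in this PR (DisabledOnContainer, EngineType, TestResource, AbstractPaimonIT) and using a placeholder reason string; the real PaimonIT additionally contains the actual test methods and container wiring.

import org.apache.seatunnel.e2e.common.TestResource;
import org.apache.seatunnel.e2e.common.container.EngineType;
import org.apache.seatunnel.e2e.common.junit.DisabledOnContainer;

@DisabledOnContainer(
        value = {},
        type = {EngineType.SPARK, EngineType.FLINK},
        disabledReason = "placeholder, see the annotation text discussed below")
public class PaimonIT extends AbstractPaimonIT implements TestResource {

    @Override
    public void startUp() throws Exception {
        // Startup initialization added by this PR, e.g. preparing the table/warehouse
        // state the tests rely on; the concrete body is in the PR diff.
    }

    @Override
    public void tearDown() throws Exception {
        // Release whatever startUp created.
    }
}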
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| seatunnel-e2e/seatunnel-connector-v2-e2e/connector-paimon-e2e/src/test/resources/paimon_to_assert.conf | Simplifies the row_rules configuration by removing extra brackets. |
| seatunnel-e2e/seatunnel-connector-v2-e2e/connector-paimon-e2e/src/test/resources/paimon_projection_to_assert.conf | Similar simplification applied to the row_rules configuration. |
| seatunnel-e2e/seatunnel-connector-v2-e2e/connector-paimon-e2e/src/test/java/org/apache/seatunnel/e2e/connector/paimon/PaimonIT.java | Adjusted test class to extend AbstractPaimonIT, implement TestResource, and disable engines with a clear reason. |
Comments suppressed due to low confidence (2)
seatunnel-e2e/seatunnel-connector-v2-e2e/connector-paimon-e2e/src/test/resources/paimon_to_assert.conf:47
- Confirm that consolidating the row_rules definition by removing the explicit array brackets aligns with the expected configuration schema and supports multiple rule entries if needed.
seatunnel-e2e/seatunnel-connector-v2-e2e/connector-paimon-e2e/src/test/resources/paimon_projection_to_assert.conf:48
- Ensure that the revised configuration structure for row_rules without an explicit array wrapper is compatible with the configuration parser, especially in cases where multiple rules might be used.
        value = {},
        type = {EngineType.SPARK, EngineType.FLINK},
        disabledReason =
                "Spark and Flink engine can not auto create paimon table on worker node in local file(e.g flink tm) by savemode feature which can lead error")
Copilot AI · Jun 19, 2025
Consider rephrasing the disabledReason to clearly explain the limitation with auto-creating paimon tables on local worker nodes (e.g., detailing the save mode issue) for better clarity to future maintainers.
| "Spark and Flink engine can not auto create paimon table on worker node in local file(e.g flink tm) by savemode feature which can lead error") | |
| "Spark and Flink engines cannot automatically create Paimon tables on local worker nodes due to limitations with the save mode feature. The save mode requires the table to exist beforehand, which can cause errors when running jobs on local file systems (e.g., Flink TaskManager).") |
|
Hi, why do we need to change this? This test seems to be able to run on Flink and Spark.
The data the reader actually reads is empty, so the assertion is wrong.
The data is written in the TaskManager, but the split info is read in the JobManager.
        value = {},
        type = {EngineType.SPARK, EngineType.FLINK},
        disabledReason =
                "Spark and Flink engine can not auto create paimon table on worker node in local file(e.g flink tm) by savemode feature which can lead error")
It does not seem necessary here.
The fundamental reason is the lack of distributed storage. Why is the retrieved data empty? Because the data was written to the local node. The data write operation happens on the TaskManager node, while the JobManager is responsible for reading the split information. However, the JobManager cannot access the metadata of the data since there is no distributed storage system in place.
Another issue is why the assertion passes even though no data was read. This is because the assert statement was implemented incorrectly, allowing it to pass even when no data was actually retrieved.
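To make the second point concrete, here is a rough JUnit-style sketch (the readRows() helper and the empty list it returns are hypothetical, and this is not the actual assert sink code): a per-row check passes vacuously on an empty result unless the row count itself is also asserted.

import java.util.Collections;
import java.util.List;
import org.junit.jupiter.api.Assertions;

public class AssertPatternSketch {
    // Hypothetical stand-in for whatever rows the Paimon reader actually produced.
    static List<String> readRows() {
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        List<String> rows = readRows();

        // Flawed pattern: with an empty list the loop body never runs, so every
        // per-row check "passes" without inspecting any data.
        for (String row : rows) {
            Assertions.assertNotNull(row);
        }

        // Stricter pattern: also require that at least one row was actually read.
        Assertions.assertFalse(rows.isEmpty(), "reader returned no data");
    }
}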
It might be that there is an error with the assert.
Can we make jobmanager and taskmanager mount the same path to solve this problem?
I attempted to resolve the issue by mounting the same file path; however, the necessary changes would need to be made in the AbstractTestFlinkContainer class, which is a shared base class. Modifying it could affect multiple tests and introduces too much risk. As an alternative, I looked into how other test classes handle similar cases and decided that skipping the Spark and Flink engines is the most practical solution.
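For context, the container startup code in question (quoted below, presumably from the shared Flink container base class mentioned above) starts the JobManager and TaskManager without mounting any shared volume between them: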
@Override
public void startUp() throws Exception {
final String dockerImage = getDockerImage();
final String properties = String.join("\n", getFlinkProperties());
jobManager =
new GenericContainer<>(dockerImage)
.withCommand("jobmanager")
.withNetwork(NETWORK)
.withNetworkAliases("jobmanager")
.withExposedPorts()
.withEnv("FLINK_PROPERTIES", properties)
.withLogConsumer(
new Slf4jLogConsumer(
DockerLoggerFactory.getLogger(dockerImage + ":jobmanager")))
.waitingFor(
new LogMessageWaitStrategy()
.withRegEx(".*Starting the resource manager.*")
.withStartupTimeout(Duration.ofMinutes(2)));
copySeaTunnelStarterToContainer(jobManager);
copySeaTunnelStarterLoggingToContainer(jobManager);
jobManager.setPortBindings(Lists.newArrayList(String.format("%s:%s", 8081, 8081)));
taskManager =
new GenericContainer<>(dockerImage)
.withCommand("taskmanager")
.withNetwork(NETWORK)
.withNetworkAliases("taskmanager")
.withEnv("FLINK_PROPERTIES", properties)
.dependsOn(jobManager)
.withLogConsumer(
new Slf4jLogConsumer(
DockerLoggerFactory.getLogger(
dockerImage + ":taskmanager")))
.waitingFor(
new LogMessageWaitStrategy()
.withRegEx(
".*Successful registration at resource manager.*")
.withStartupTimeout(Duration.ofMinutes(2)));
Startables.deepStart(Stream.of(jobManager)).join();
Startables.deepStart(Stream.of(taskManager)).join();
// execute extra commands
executeExtraCommands(jobManager);
}
It should be possible to mount a shared path that other programs will not use, for example /opt/seatunnel.
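A minimal Testcontainers-style sketch of that idea, assuming GenericContainer's withFileSystemBind API; the host directory parameter and the /opt/seatunnel container path are illustrative choices, not the exact change made here.

import org.testcontainers.containers.BindMode;
import org.testcontainers.containers.GenericContainer;

public class SharedPathSketch {
    // Bind the same host directory into both containers so files written by the
    // TaskManager become visible to the JobManager when it reads split metadata.
    static void bindSharedPath(
            GenericContainer<?> jobManager,
            GenericContainer<?> taskManager,
            String hostPath) {
        jobManager.withFileSystemBind(hostPath, "/opt/seatunnel", BindMode.READ_WRITE);
        taskManager.withFileSystemBind(hostPath, "/opt/seatunnel", BindMode.READ_WRITE);
    }
}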
done
cc @hawk9821
        primaryKey {
            name = "pk_id"
            columnNames = [pk_id]
        }
why remove this?
The pk_id in the generated fake data is randomly created, which may lead to duplicate entries. Due to the primary key constraint, the actual amount of inserted data may be less than the target. The reason it passed the previous assertion is because the assertion was written incorrectly.
Makes sense to me. cc @hawk9821
Data generated by FakeSource may indeed be repetitive. With a range of 0 to Integer.MAX_VALUE the probability of repetition is relatively low, so this case can run correctly. To ensure absolute certainty, I think we can enable FakeSource to support auto-incrementing primary keys. cc @WenDing-Y @Hisoka-X
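As a rough illustration of the difference (none of this is FakeSource's actual implementation): a monotonically increasing counter can never repeat, while independent random draws in [0, Integer.MAX_VALUE) can collide, and a primary-key table would then deduplicate the colliding rows on write.

import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

public class PrimaryKeySketch {
    private final AtomicLong counter = new AtomicLong();

    // Unique by construction: every call returns a strictly larger value.
    public long nextAutoIncrementId() {
        return counter.incrementAndGet();
    }

    // May repeat across calls: two generated rows can draw the same pk_id.
    public int nextRandomId() {
        return ThreadLocalRandom.current().nextInt(0, Integer.MAX_VALUE);
    }
}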
Hisoka-X left a comment
LGTM. Thanks @WenDing-Y
Purpose of this pull request
fix #9466
Does this PR introduce any user-facing change?
How was this patch tested?
Check list
New License Guide