[Bug] streampark delete k8s deployment when JobManager restart #3423

@zhenyuT

Description

Search before asking

  • I have searched the issues and found no similar issues.

Java Version

java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

Scala Version

2.12.x

StreamPark Version

apache-streampark_2.12-2.1.1-incubating

Flink Version

1.15.3

Deploy mode

kubernetes-application

What happened

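Flink on k8s uses ZooKeeper (zk) for high availability. A fault-injection tool was used to simulate the JobManager timing out when accessing the zk nodes (30 s of network latency, lasting 120 s before recovery).
When the job is started from the command line, the JobManager restarts automatically, and the job recovers on its own once the network-latency fault ends.
When the job is started through StreamPark, the whole deployment is suddenly removed while the JobManager is restarting.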

The StreamPark logs show that StreamPark detected that the Flink job had failed and triggered a delete-deployment operation.
[Attached image: 1703233420947_125D8297-C2B0-462f-8A04-170833A89075]
[Attached image: 1703233457034_5E41A593-3067-4eb3-A168-2D5A182D96DC]

In my opinion, StreamPark should not forcibly delete the deployment from outside unless the user explicitly asks to delete the job.
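The proposed behavior could be sketched roughly as follows. This is only an illustration of the policy, not StreamPark's actual code: `JobState`, `on_job_state_change`, and the returned action names are all hypothetical, and the real watcher logic may be structured quite differently.

```python
from enum import Enum


class JobState(Enum):
    """Hypothetical subset of job states a watcher might observe."""
    RUNNING = "RUNNING"
    RESTARTING = "RESTARTING"
    FAILED = "FAILED"


def on_job_state_change(state: JobState, user_requested_stop: bool) -> str:
    """Decide what the watcher should do with the k8s deployment.

    The deployment is deleted only when the user explicitly asked for it.
    An observed FAILED state is recorded, but the deployment is kept so
    that the JobManager can restart and recover through ZooKeeper HA.
    """
    if user_requested_stop:
        return "delete-deployment"
    if state is JobState.FAILED:
        return "mark-failed-keep-deployment"
    return "no-op"


# Scenario from this report: the job is reported as failed during the
# zk timeout, but nobody asked to stop it, so the deployment is kept.
print(on_job_state_change(JobState.FAILED, user_requested_stop=False))
```

With a rule like this, the transient failure during the zk timeout would only change the status shown for the job, and the deployment would still exist when the network fault ends.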

Error Exception

No response

Screenshots

No response

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Code of Conduct
