Description
Bug Type
None
Before submit
- I have searched the existing Issues and FAQ and confirmed that there is no identical or duplicate problem
Environment
- Server Version: v0.11.2
- Backend: RocksDB
- OS: Ubuntu 2x.x
- Data Size: about 10 million vertices, 90 million edges
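Expected & Actual behavior
Expected: when the process restarts, HugeGraph tries to restore the previously existing asynchronous tasks and, once they are restored, starts normally.
Actual:
1. When restoring a previously existing asynchronous task fails, server startup is aborted. On the next restart, startup is aborted again by the same restore failure, so the server cannot recover.
2. Task restore takes task-status priority into account, but the order in which tasks are processed at submission allows some tasks to be scheduled more than once.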
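One possible way to address point 1 is to isolate failures per task, so that a single task that cannot be restored is logged and skipped instead of aborting startup. The following is only a minimal sketch of that idea: `TolerantRestore`, `restoreAll`, and the `restoreOne` callback are hypothetical names for illustration and are not part of the HugeGraph API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

// Hypothetical helper, not a HugeGraph class.
final class TolerantRestore {

    // Tries to restore every task id in order and returns the ids that failed,
    // so the caller can report them after the server has started.
    static List<Long> restoreAll(List<Long> taskIds, LongConsumer restoreOne) {
        List<Long> failed = new ArrayList<>();
        for (long id : taskIds) {
            try {
                restoreOne.accept(id);
            } catch (RuntimeException e) {
                // Log and continue instead of letting the exception reach main()
                System.err.println("Failed to restore task " + id + ": " + e.getMessage());
                failed.add(id);
            }
        }
        return failed;
    }
}
```

Whether skipped tasks should then be marked as failed or retried later is a separate design decision that this sketch does not cover.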
Exception stack trace
#tailf deploy-stderr.log
at org.glassfish.grizzly.http.server.HttpServer.start(HttpServer.java:269)
at com.baidu.hugegraph.server.RestServer.start(RestServer.java:64)
at com.baidu.hugegraph.server.RestServer.start(RestServer.java:133)
at com.baidu.hugegraph.dist.HugeRestServer.start(HugeRestServer.java:58)
at com.baidu.hugegraph.dist.HugeGraphServer.<init>(HugeGraphServer.java:55)
at com.baidu.hugegraph.dist.HugeGraphServer.main(HugeGraphServer.java:100)
Dec 15, 2023 4:19:32 PM org.glassfish.grizzly.http.server.NetworkListener start
INFO: Started listener bound to [0.0.0.0:9080]
Dec 15, 2023 4:19:33 PM org.glassfish.grizzly.http.server.NetworkListener shutdownNow
INFO: Stopped listener bound to [0.0.0.0:9080]
Exception in thread "main" java.lang.IllegalArgumentException: Task '37474' is already in the queue
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:163)
at com.baidu.hugegraph.util.E.checkArgument(E.java:56)
at com.baidu.hugegraph.task.StandardTaskScheduler.restore(StandardTaskScheduler.java:206)
at com.baidu.hugegraph.task.StandardTaskScheduler.restoreTasks(StandardTaskScheduler.java:194)
at com.baidu.hugegraph.StandardHugeGraph.serverStarted(StandardHugeGraph.java:223)
at com.baidu.hugegraph.auth.HugeGraphAuthProxy.serverStarted(HugeGraphAuthProxy.java:601)
at com.baidu.hugegraph.core.GraphManager.serverStarted(GraphManager.java:237)
at com.baidu.hugegraph.core.GraphManager.<init>(GraphManager.java:78)
at com.baidu.hugegraph.server.ApplicationConfig$GraphManagerFactory$1.onEvent(ApplicationConfig.java:108)
at org.glassfish.jersey.server.internal.monitoring.CompositeApplicationEventListener.onEvent(CompositeApplicationEventListener.java:74)
at org.glassfish.jersey.server.internal.monitoring.MonitoringContainerListener.onStartup(MonitoringContainerListener.java:81)
at org.glassfish.jersey.server.ApplicationHandler.onStartup(ApplicationHandler.java:1180)
at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.start(GrizzlyHttpContainer.java:357)
at org.glassfish.grizzly.http.server.HttpHandlerChain.start(HttpHandlerChain.java:398)
at org.glassfish.grizzly.http.server.HttpServer.setupHttpHandler(HttpServer.java:293)
at org.glassfish.grizzly.http.server.HttpServer.start(HttpServer.java:269)
at com.baidu.hugegraph.server.RestServer.start(RestServer.java:64)
at com.baidu.hugegraph.server.RestServer.start(RestServer.java:133)
at com.baidu.hugegraph.dist.HugeRestServer.start(HugeRestServer.java:58)
at com.baidu.hugegraph.dist.HugeGraphServer.<init>(HugeGraphServer.java:55)
at com.baidu.hugegraph.dist.HugeGraphServer.main(HugeGraphServer.java:100)
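Potential bug: the RESTORING tasks fetched in the first step may, once submitted, be scheduled and switch to RUNNING. The second step then queries the list of RUNNING tasks, which now includes the RESTORING tasks already fetched in the first step. The same task ID is submitted twice, and server startup fails.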
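The suspected sequence can be reproduced with a small, self-contained model. This is a sketch under assumptions, not the actual `StandardTaskScheduler` code: `RestoreSketch`, `TaskRecord`, `queryByStatus`, and `submit` are hypothetical stand-ins, the RESTORING, RUNNING, QUEUED order is assumed, and `submit` flips the status to RUNNING immediately to imitate a task being picked up right after submission. `restoreInterleaved` queries and submits one status at a time and hits the duplicate; `restoreSnapshotFirst` collects all pending tasks (deduplicated by ID) before submitting any of them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; none of these are the real HugeGraph classes.
final class RestoreSketch {
    enum TaskStatus { RESTORING, RUNNING, QUEUED }

    static final class TaskRecord {
        final long id;
        TaskStatus status;
        TaskRecord(long id, TaskStatus status) { this.id = id; this.status = status; }
    }

    // Assumed restore priority: RESTORING first, then RUNNING, then QUEUED.
    static final TaskStatus[] ORDER = {
        TaskStatus.RESTORING, TaskStatus.RUNNING, TaskStatus.QUEUED
    };

    private final Map<Long, TaskRecord> store = new LinkedHashMap<>();
    private final Set<Long> queue = new LinkedHashSet<>();

    void save(long id, TaskStatus status) { store.put(id, new TaskRecord(id, status)); }

    Set<Long> queued() { return queue; }

    // Returns the tasks that currently have the given status.
    List<TaskRecord> queryByStatus(TaskStatus status) {
        List<TaskRecord> result = new ArrayList<>();
        for (TaskRecord t : store.values()) {
            if (t.status == status) result.add(t);
        }
        return result;
    }

    // Rejects duplicates, then mimics the scheduler starting the task at once.
    void submit(TaskRecord task) {
        if (!queue.add(task.id)) {
            throw new IllegalArgumentException("Task '" + task.id + "' is already in the queue");
        }
        task.status = TaskStatus.RUNNING;
    }

    // Query one status, submit, query the next: a submitted RESTORING task
    // shows up again in the RUNNING query and is submitted a second time.
    void restoreInterleaved() {
        for (TaskStatus status : ORDER) {
            for (TaskRecord t : queryByStatus(status)) submit(t);
        }
    }

    // Collect every pending task first, keyed by ID, and only then submit.
    void restoreSnapshotFirst() {
        Map<Long, TaskRecord> pending = new LinkedHashMap<>();
        for (TaskStatus status : ORDER) {
            for (TaskRecord t : queryByStatus(status)) pending.putIfAbsent(t.id, t);
        }
        for (TaskRecord t : pending.values()) submit(t);
    }
}
```

Taking the snapshot before any submission keeps the priority order while ensuring each task ID is submitted at most once.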

Vertex/Edge example
No response
Schema [VertexLabel, EdgeLabel, IndexLabel]
No response