Skip to content

[Bug] HStore partition leader changes cause P99 latency jitter in the cluster's read and write operations #2763

@JackyYangPassion

Description

@JackyYangPassion

Bug Type (问题类型)

None

Before submit

  • 我已经确认现有的 IssuesFAQ 中没有相同 / 重复问题 (I have confirmed and searched that there are no similar problems in the historical issue and documents)

Environment (环境信息)

  • Server Version: 1.5.0 (Apache Release Version)
  • Backend: HStore, SSD
  • OS: 32 CPUs, 128 G RAM, CentOS 7
  • Data Size: 1 billion vertices, 3 billion edges

Expected & Actual behavior (期望与实际表现)

Monitor

read

Image

write

Image

从监控看 节点A 出现了问题

HugeServer Log 节点A

server.log

HStore Log

Store.log

PD Log

2025-05-09 04:53:53 [grpc-default-worker-ELG-3-5] [INFO] i.g.n.s.i.g.n.N.connections - Transport failed
io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2Exception: HTTP/2 client preface string missing or corrupt. Hex dump for received bytes: 16030101300100012c03037d2ea24fb94c70eca27879a1b7
        at io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2Exception.connectionError(Http2Exception.java:103) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.readClientPrefaceString(Http2ConnectionHandler.java:306) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.decode(Http2ConnectionHandler.java:239) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:438) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at java.lang.Thread.run(Thread.java:834) ~[?:?]
2025-05-09 04:54:03 [grpc-default-worker-ELG-3-7] [INFO] i.g.n.s.i.g.n.N.connections - Transport failed
io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2Exception: Unexpected HTTP/1.x request: GET /
        at io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2Exception.connectionError(Http2Exception.java:103) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.readClientPrefaceString(Http2ConnectionHandler.java:302) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.decode(Http2ConnectionHandler.java:239) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:438) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[grpc-netty-shaded-1.39.0.jar!/:1.39.0]
        at java.lang.Thread.run(Thread.java:834) ~[?:?]

Metadata

Metadata

Labels

bugSomething isn't workinggood first issueGood for newcomers

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions