-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Sarama has stuck on the batch with zero records length. #2053
Description
Versions
Please specify real version numbers or git SHAs, not just "Latest" since that changes fairly regularly.
| Sarama | Kafka | Go |
|---|---|---|
| Version 1.29.1 | AWS Managed Kafka | 1.17 |
Configuration
What configuration values are you using for Sarama and Kafka?
cfg.Net.MaxOpenRequests = 1
// consume all messages
cfg.Consumer.IsolationLevel = sarama.ReadUncommitted
// require producer idempotency
cfg.Producer.Idempotent = true
// require consistency on Kafka side before considering the message produced
cfg.Producer.RequiredAcks = sarama.WaitForAll
// specify network keep-alive to reduce 'broken pipe' errors on non-KMS environments
cfg.Net.KeepAlive = time.Second * 30
// increase processing time from 1s (default) to 5s
cfg.Consumer.MaxProcessingTime = time.Second * 5Logs
When filing an issue please provide logs from Sarama and Kafka if at all
possible. You can set sarama.Logger to a log.Logger to capture Sarama debug
output.
logs: CLICK ME
2021/10/22 10:38:26.384689 lvl=INF component=kafka records batch &{5805968 49 2 snappy 0 false false 0 2021-09-30 11:45:34.617 +0200 CEST 2021-09-30 11:45:34.617 +0200 CEST 82609 0 1563 [0xc00030e150] false false [] 1326}
2021/10/22 10:38:26.384705 lvl=INF component=kafka records batch &{5806083 49 2 snappy 0 false false 0 2021-09-30 12:00:01.188 +0200 CEST 2021-09-30 12:00:01.188 +0200 CEST 82609 0 1678 [0xc00030e1c0] false false [] 733}
2021/10/22 10:38:26.384728 lvl=INF component=kafka records batch &{5806313 49 2 snappy 0 false false 0 2021-09-30 12:29:55.542 +0200 CEST 2021-09-30 12:29:55.542 +0200 CEST 82609 0 1908 [0xc00030e230] false false [] 10546}
2021/10/22 10:38:26.384751 lvl=INF component=kafka records batch &{5806399 49 2 snappy 0 false false 0 2021-09-30 12:42:07.755 +0200 CEST 2021-09-30 12:42:07.755 +0200 CEST 82609 0 1994 [0xc00030e2a0] false false [] 1130}
2021/10/22 10:38:26.384772 lvl=INF component=kafka records batch &{5806689 49 2 snappy 0 false false 0 2021-09-30 13:25:25.119 +0200 CEST 2021-09-30 13:25:25.119 +0200 CEST 82609 0 2284 [0xc00030e380] false false [] 2934}
2021/10/22 10:38:26.384790 lvl=INF component=kafka records batch &{5807057 49 2 snappy 0 false false 0 2021-09-30 14:24:12.229 +0200 CEST 2021-09-30 14:24:12.229 +0200 CEST 82609 0 2652 [0xc00030e3f0] false false [] 3124}
2021/10/22 10:38:26.384807 lvl=INF component=kafka records batch &{5807106 49 2 snappy 0 false false 0 2021-09-30 14:31:33.337 +0200 CEST 2021-09-30 14:31:33.337 +0200 CEST 82609 0 2701 [0xc00030e460] false false [] 2335}
2021/10/22 10:38:26.384824 lvl=INF component=kafka records batch &{5807150 49 2 snappy 0 false false 0 2021-09-30 14:38:49.18 +0200 CEST 2021-09-30 14:38:49.18 +0200 CEST 82609 0 2745 [0xc00030e4d0] false false [] 2917}
2021/10/22 10:38:26.384858 lvl=INF component=kafka records batch &{5807374 49 2 snappy 0 false false 0 2021-09-30 15:12:07.587 +0200 CEST 2021-09-30 15:12:07.587 +0200 CEST 82609 0 2969 [0xc00030e540] false false [] 5744}
2021/10/22 10:38:26.384933 lvl=INF component=kafka records batch &{5807388 49 2 snappy 0 false false 0 2021-09-30 15:14:10.358 +0200 CEST 2021-09-30 15:14:10.358 +0200 CEST 82609 0 2983 [0xc00030e5b0] false false [] 31539}
2021/10/22 10:38:26.384958 lvl=INF component=kafka records batch &{5807391 49 2 snappy 0 false false 0 2021-09-30 15:14:44.664 +0200 CEST 2021-09-30 15:14:44.664 +0200 CEST 82609 0 2986 [0xc00030e620] false false [] 2135}
2021/10/22 10:38:26.385027 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.418552 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.450115 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.481958 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.522805 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.557499 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.593690 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.625917 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.657295 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.775601 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.806612 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.840360 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.871158 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
Problem Description
I'm using Goka (Sarama based library) and have some problems with reading from the one partition of 4. The consumer has stuck on the one particular batch and rereads it again and again.
Here is some info:
This is batch struct:
type RecordBatch struct {
FirstOffset int64
PartitionLeaderEpoch int32
Version int8
Codec CompressionCodec
CompressionLevel int
Control bool
LogAppendTime bool
LastOffsetDelta int32
FirstTimestamp time.Time
MaxTimestamp time.Time
ProducerID int64
ProducerEpoch int16
FirstSequence int32
Records []*Record
PartialTrailingRecord bool
IsTransactional bool
compressedRecords []byte
recordsLen int // uncompressed records size
}I added some logs for all decoded batches and firstly I see this:
2021/10/22 10:38:26.384671 lvl=INF component=kafka records batch &{5805856 49 2 snappy 0 false false 0 2021-09-30 11:29:46.662 +0200 CEST 2021-09-30 11:29:46.662 +0200 CEST 82609 0 1451 [0xc00030e0e0] false false [] 1735}
2021/10/22 10:38:26.384689 lvl=INF component=kafka records batch &{5805968 49 2 snappy 0 false false 0 2021-09-30 11:45:34.617 +0200 CEST 2021-09-30 11:45:34.617 +0200 CEST 82609 0 1563 [0xc00030e150] false false [] 1326}
2021/10/22 10:38:26.384705 lvl=INF component=kafka records batch &{5806083 49 2 snappy 0 false false 0 2021-09-30 12:00:01.188 +0200 CEST 2021-09-30 12:00:01.188 +0200 CEST 82609 0 1678 [0xc00030e1c0] false false [] 733}
2021/10/22 10:38:26.384728 lvl=INF component=kafka records batch &{5806313 49 2 snappy 0 false false 0 2021-09-30 12:29:55.542 +0200 CEST 2021-09-30 12:29:55.542 +0200 CEST 82609 0 1908 [0xc00030e230] false false [] 10546}
2021/10/22 10:38:26.384751 lvl=INF component=kafka records batch &{5806399 49 2 snappy 0 false false 0 2021-09-30 12:42:07.755 +0200 CEST 2021-09-30 12:42:07.755 +0200 CEST 82609 0 1994 [0xc00030e2a0] false false [] 1130}
2021/10/22 10:38:26.384772 lvl=INF component=kafka records batch &{5806689 49 2 snappy 0 false false 0 2021-09-30 13:25:25.119 +0200 CEST 2021-09-30 13:25:25.119 +0200 CEST 82609 0 2284 [0xc00030e380] false false [] 2934}
2021/10/22 10:38:26.384790 lvl=INF component=kafka records batch &{5807057 49 2 snappy 0 false false 0 2021-09-30 14:24:12.229 +0200 CEST 2021-09-30 14:24:12.229 +0200 CEST 82609 0 2652 [0xc00030e3f0] false false [] 3124}
2021/10/22 10:38:26.384807 lvl=INF component=kafka records batch &{5807106 49 2 snappy 0 false false 0 2021-09-30 14:31:33.337 +0200 CEST 2021-09-30 14:31:33.337 +0200 CEST 82609 0 2701 [0xc00030e460] false false [] 2335}
2021/10/22 10:38:26.384824 lvl=INF component=kafka records batch &{5807150 49 2 snappy 0 false false 0 2021-09-30 14:38:49.18 +0200 CEST 2021-09-30 14:38:49.18 +0200 CEST 82609 0 2745 [0xc00030e4d0] false false [] 2917}
2021/10/22 10:38:26.384858 lvl=INF component=kafka records batch &{5807374 49 2 snappy 0 false false 0 2021-09-30 15:12:07.587 +0200 CEST 2021-09-30 15:12:07.587 +0200 CEST 82609 0 2969 [0xc00030e540] false false [] 5744}
2021/10/22 10:38:26.384933 lvl=INF component=kafka records batch &{5807388 49 2 snappy 0 false false 0 2021-09-30 15:14:10.358 +0200 CEST 2021-09-30 15:14:10.358 +0200 CEST 82609 0 2983 [0xc00030e5b0] false false [] 31539}
2021/10/22 10:38:26.384958 lvl=INF component=kafka records batch &{5807391 49 2 snappy 0 false false 0 2021-09-30 15:14:44.664 +0200 CEST 2021-09-30 15:14:44.664 +0200 CEST 82609 0 2986 [0xc00030e620] false false [] 2135}
Looks ok. But then I see:
2021/10/22 10:38:26.385027 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.418552 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.450115 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.481958 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.522805 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.557499 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.593690 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.625917 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.657295 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.775601 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.806612 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.840360 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.871158 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.901792 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
2021/10/22 10:38:26.934249 lvl=INF component=kafka records batch &{5807565 49 2 none 0 false false 0 0001-01-01 00:00:00 +0000 UTC 2021-09-30 15:43:16.281 +0200 CEST 82609 0 3160 [] false false [] 0}
and it keeps going indefinitely.
I dug into the code and figured out that there is no exit from such situation when the records len is 0. Sarama just reads this batch again and again. A high watermark was not reached.
We use AWS Managed Kafka in this case.
Please help. We are firing.