F190416 01:58:51.634989 172922 storage/replica_consistency.go:220 [n5,consistencyChecker,s5,r590/1:/Table/68/1/{29/4/2…-31/4/1…}] consistency check failed with 1 inconsistent replicas
goroutine 172922 [running]:
github.com/cockroachdb/cockroach/pkg/util/log.getStacks(0xc000056301, 0xc000056300, 0x5449800, 0x1e)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:1020 +0xd4
github.com/cockroachdb/cockroach/pkg/util/log.(*loggingT).outputLogEntry(0x5bdd700, 0xc000000004, 0x5449860, 0x1e, 0xdc, 0xc008bfa5a0, 0x79)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:878 +0x93d
github.com/cockroachdb/cockroach/pkg/util/log.addStructured(0x3aa1620, 0xc0071899e0, 0x4, 0x2, 0x33b2862, 0x36, 0xc01f06cce0, 0x1, 0x1)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/structured.go:85 +0x2d8
github.com/cockroachdb/cockroach/pkg/util/log.logDepth(0x3aa1620, 0xc0071899e0, 0x1, 0xc000000004, 0x33b2862, 0x36, 0xc01f06cce0, 0x1, 0x1)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:71 +0x8c
github.com/cockroachdb/cockroach/pkg/util/log.Fatalf(0x3aa1620, 0xc0071899e0, 0x33b2862, 0x36, 0xc01f06cce0, 0x1, 0x1)
/go/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:182 +0x7e
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).CheckConsistency(0xc023eeb400, 0x3aa1620, 0xc0071899e0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica_consistency.go:220 +0x6ce
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).CheckConsistency(0xc023eeb400, 0x3aa1620, 0xc0071899e0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica_consistency.go:229 +0x81b
github.com/cockroachdb/cockroach/pkg/storage.(*consistencyQueue).process(0xc0003de2a0, 0x3aa1620, 0xc0071899e0, 0xc023eeb400, 0x0, 0x0, 0x0)
/go/src/github.com/cockroachdb/cockroach/pkg/storage/consistency_queue.go:125 +0x210
To do / understand
Why did nathan-tpcc-geo:7 end up rotating into a new log file to log the fatal error without retaining any prior ones? (answer: log rotation, see "storage: consistency check failure during import" #36861 (comment))

This looks very similar to #35424, so it's possible that that issue wasn't fully resolved. I was most of the way through a TPC-C 4k import when a node died due to a consistency check failure.
Cockroach SHA: 3ebed10
Notes:
Cluster: nathan-tpcc-geo (stopped, extended for 48h)
Cockroach nodes: 1,2,4,5,7,8,10,11
Inconsistent range: r590
Replicas: nathan-tpcc-geo:2/n2/r3, nathan-tpcc-geo:5/n4/r4, and nathan-tpcc-geo:7/n5/r1
Inconsistent replica: nathan-tpcc-geo:7/n5/r1
Replicas in zones: europe-west2-b, europe-west4-b, and asia-northeast1-b respectively

Initial Investigation
Unlike in the later reproductions of #35424, replica 1's Raft log is an exact prefix of the logs of replicas 3 and 4, so this doesn't look like the same problem we saw later in that issue.
I haven't looked at much else yet.
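
For anyone repeating the comparison, the "exact prefix" check on the Raft logs can be sketched roughly as below. This is only an illustration, not the tooling actually used here: the `entry` type and `isPrefix` helper are hypothetical, and the sketch assumes each replica's log has already been dumped into a slice of (index, term) pairs. It compares index and term only; a stricter check could also compare the command payloads.

```go
package main

import "fmt"

// entry is a hypothetical, simplified stand-in for a Raft log entry.
// Only the index and term are kept; payloads are ignored in this sketch.
type entry struct {
	Index uint64
	Term  uint64
}

// isPrefix reports whether short is an exact prefix of long, i.e. every
// entry in short equals the entry at the same position in long.
func isPrefix(short, long []entry) bool {
	if len(short) > len(long) {
		return false
	}
	for i, e := range short {
		if long[i] != e {
			return false
		}
	}
	return true
}

func main() {
	// Made-up example data, not taken from the cluster in this issue.
	behind := []entry{{Index: 10, Term: 5}, {Index: 11, Term: 5}}
	ahead := []entry{{Index: 10, Term: 5}, {Index: 11, Term: 5}, {Index: 12, Term: 6}}
	diverged := []entry{{Index: 10, Term: 5}, {Index: 11, Term: 6}}

	fmt.Println(isPrefix(behind, ahead))   // the shorter log is merely behind
	fmt.Println(isPrefix(diverged, ahead)) // same index, different term
}
```

A replica whose log is a strict prefix of the others is only lagging, whereas a mismatch at the same index would point to diverging logs.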
r590 Range _ Debug _ Cockroach Console.pdf