The specified key does not exist #48474

@CheSema

Description

Stress Test (debug)

https://s3.amazonaws.com/clickhouse-test-reports/47104/554c9285347b80663a494aae25c8b7181c5ae1a5/stress_test__debug_.html

For example, this line:
/var/log/clickhouse-server/clickhouse-server.final.log:2023.04.06 06:19:54.813250 [ 32032 ] {} <Error> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): while loading part 2_1_1_0 on path store/75a/75a80e37-9556-4417-8e63-527508abe126/2_1_1_0: Code: 499. DB::Exception: The specified key does not exist.: while reading key: test/dnb/juitwdfwgdhufslujokimemazcxpt, from bucket: test. (S3_ERROR), Stack trace (when copying this message, always include the lines below):

The key was written, and read at least once:

$ cat clickhouse-server.stress.log | grep juitwdfwgdhufslujokimemazcxpt
grep: (standard input): binary file matches
$ cat clickhouse-server.stress.log | grep -a juitwdfwgdhufslujokimemazcxpt
2023.04.06 05:58:35.799258 [ 2772 ] {eaabc052-75a3-40f9-adff-17bb22d0837e} <Trace> WriteBufferFromS3: Making single part upload. Bucket: test, Key: test/dnb/juitwdfwgdhufslujokimemazcxpt, Size: 70, WithPool: true
2023.04.06 05:58:35.807235 [ 2067 ] {eaabc052-75a3-40f9-adff-17bb22d0837e} <Trace> WriteBufferFromS3: Single part upload has completed. Bucket: test, Key: test/dnb/juitwdfwgdhufslujokimemazcxpt, Object size: 70, WithPool: true
2023.04.06 05:58:37.130189 [ 2772 ] {eaabc052-75a3-40f9-adff-17bb22d0837e} <Trace> WriteBufferFromS3: Checking object test/dnb/juitwdfwgdhufslujokimemazcxpt exists after upload
2023.04.06 05:58:37.134141 [ 2772 ] {eaabc052-75a3-40f9-adff-17bb22d0837e} <Trace> WriteBufferFromS3: Object test/dnb/juitwdfwgdhufslujokimemazcxpt exists after upload
2023.04.06 05:58:37.807686 [ 1910 ] {} <Test> ReadBufferFromS3: Read S3 object. Bucket: test, Key: test/dnb/juitwdfwgdhufslujokimemazcxpt, Version: Latest, Range: 0-69
2023.04.06 05:58:39.357498 [ 1907 ] {} <Test> ReadBufferFromS3: Read S3 object. Bucket: test, Key: test/dnb/juitwdfwgdhufslujokimemazcxpt, Version: Latest, Range: 0-69
2023.04.06 05:58:41.635790 [ 1908 ] {} <Test> ReadBufferFromS3: Read S3 object. Bucket: test, Key: test/dnb/juitwdfwgdhufslujokimemazcxpt, Version: Latest, Range: 0-69
2023.04.06 05:58:43.962010 [ 1911 ] {} <Test> ReadBufferFromS3: Read S3 object. Bucket: test, Key: test/dnb/juitwdfwgdhufslujokimemazcxpt, Version: Latest, Range: 0-69
2023.04.06 05:58:45.456952 [ 1630 ] {59bd6098-e575-49be-ab37-4aa9816a086a} <Test> ReadBufferFromS3: Read S3 object. Bucket: test, Key: test/dnb/juitwdfwgdhufslujokimemazcxpt, Version: Latest, Range: 0-69
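The whole lifecycle of a key can be pulled out of the log with a single grep, as above. A minimal, self-contained sketch of the same technique over a synthetic log (the sample lines merely stand in for clickhouse-server.stress.log):

```shell
#!/bin/sh
set -e
# Sketch: reconstruct the lifecycle of one S3 key from a server log.
# The sample log below stands in for clickhouse-server.stress.log;
# the key is the one from this report.
KEY='test/dnb/juitwdfwgdhufslujokimemazcxpt'

cat > /tmp/sample.log <<'EOF'
2023.04.06 05:58:35 <Trace> WriteBufferFromS3: Making single part upload. Key: test/dnb/juitwdfwgdhufslujokimemazcxpt
2023.04.06 05:58:37 <Test> ReadBufferFromS3: Read S3 object. Key: test/dnb/juitwdfwgdhufslujokimemazcxpt
2023.04.06 05:59:00 <Trace> SomethingElse: unrelated line
EOF

# -a forces text mode, since the stress log contains binary data;
# lines come out already time-ordered because the log is append-only
grep -a "$KEY" /tmp/sample.log
```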

After that, part 2_1_1_0 is merged away:

$ cat clickhouse-server.stress.log  | grep -a 75a80e37-9556-4417-8e63-527508abe126 | grep -a  2_1_1_0
2023.04.06 05:58:39.165686 [ 1907 ] {} <Debug> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): Fetching part 2_1_1_0 from /test/01154_move_partition_long_test_mri0otkd/s1/src/replicas/r1_1832171352
2023.04.06 05:58:39.312703 [ 1907 ] {} <Debug> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126) (Fetcher): Downloading part 2_1_1_0 unique id test/feu/kzxeehumnmayqpjrhjtfikrhoirzc metadata onto disk s3.
2023.04.06 05:58:39.312874 [ 1907 ] {} <Trace> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): Set zookeeper temporary ephemeral lock /clickhouse/zero_copy/zero_copy_s3/21d9b077-ce57-49dc-99a5-6e0852561207/2_1_1_0/test_feu_kzxeehumnmayqpjrhjtfikrhoirzc/r1_4291439624
2023.04.06 05:58:39.493109 [ 1907 ] {} <Trace> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): Set zookeeper persistent lock /clickhouse/zero_copy/zero_copy_s3/21d9b077-ce57-49dc-99a5-6e0852561207/2_1_1_0/test_feu_kzxeehumnmayqpjrhjtfikrhoirzc/r1_4291439624
2023.04.06 05:58:39.519834 [ 1907 ] {} <Debug> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126) (Fetcher): Download of part 2_1_1_0 unique id test/feu/kzxeehumnmayqpjrhjtfikrhoirzc metadata onto disk s3 finished.
2023.04.06 05:58:39.556399 [ 1907 ] {} <Trace> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): Renaming temporary part tmp-fetch_2_1_1_0 to 2_1_1_0 with tid (1, 1, 00000000-0000-0000-0000-000000000000).
2023.04.06 05:58:39.609951 [ 1907 ] {} <Test> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): preparePartForCommit: inserting 2_1_1_0 (state PreActive) into data_parts_indexes
2023.04.06 05:58:39.634741 [ 1907 ] {} <Trace> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): Set zookeeper persistent lock /clickhouse/zero_copy/zero_copy_s3/21d9b077-ce57-49dc-99a5-6e0852561207/2_1_1_0/test_feu_kzxeehumnmayqpjrhjtfikrhoirzc/r1_4291439624
2023.04.06 05:58:39.646941 [ 1907 ] {} <Debug> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): Fetched part 2_1_1_0 from /test/01154_move_partition_long_test_mri0otkd/s1/src/replicas/r1_1832171352
2023.04.06 05:59:30.316041 [ 1891 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:30.654064 [ 1891 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:30.858568 [ 1893 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.273448 [ 1882 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.349823 [ 1891 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.567292 [ 1895 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.703101 [ 1897 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.819764 [ 1892 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.934300 [ 1893 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:32.177973 [ 1892 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:32.262037 [ 1893 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:32.373372 [ 1894 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:33.315178 [ 1894 ] {} <Test> VersionMetadata: Trying to lock removal_tid by (1, 1, 00000000-0000-0000-0000-000000000000), table: test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126), part: 2_1_1_0
2023.04.06 05:59:33.320683 [ 1894 ] {} <Debug> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): Part 2_1_1_0 is rendered obsolete by fetching part 2_0_5_1
$ cat clickhouse-server.stress.log  | grep -a '75a80e37-9556-4417-8e63-527508abe126::2_0_5_1'
2023.04.06 05:59:30.316041 [ 1891 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:30.316230 [ 1891 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Don't have all parts (at least 2_5_5_0 is missing) for merge 2_0_5_1; will try to fetch it instead. Either pool for fetches is starving, see background_fetches_pool_size, or none of active replicas has it
2023.04.06 05:59:30.378822 [ 1891 ] {} <Information> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Code: 234. DB::Exception: No active replica has part 2_0_5_1 or covering part (cannot execute queue-0000000063: MERGE_PARTS with virtual parts [2_0_5_1]). (NO_REPLICA_HAS_PART)
2023.04.06 05:59:30.654064 [ 1891 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:30.654313 [ 1891 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Don't have all parts (at least 2_5_5_0 is missing) for merge 2_0_5_1; will try to fetch it instead. Either pool for fetches is starving, see background_fetches_pool_size, or none of active replicas has it
2023.04.06 05:59:30.739067 [ 1891 ] {} <Information> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Code: 234. DB::Exception: No active replica has part 2_0_5_1 or covering part (cannot execute queue-0000000063: MERGE_PARTS with virtual parts [2_0_5_1]). (NO_REPLICA_HAS_PART)
2023.04.06 05:59:30.858568 [ 1893 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:30.858715 [ 1893 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Don't have all parts (at least 2_5_5_0 is missing) for merge 2_0_5_1; will try to fetch it instead. Either pool for fetches is starving, see background_fetches_pool_size, or none of active replicas has it
2023.04.06 05:59:30.955263 [ 1893 ] {} <Information> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Code: 234. DB::Exception: No active replica has part 2_0_5_1 or covering part (cannot execute queue-0000000063: MERGE_PARTS with virtual parts [2_0_5_1]). (NO_REPLICA_HAS_PART)
2023.04.06 05:59:31.273448 [ 1882 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.315332 [ 1882 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Merge of part 2_0_5_1 started by some other replica, will wait it and fetch merged part
2023.04.06 05:59:31.349823 [ 1891 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.404328 [ 1891 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Merge of part 2_0_5_1 started by some other replica, will wait it and fetch merged part
2023.04.06 05:59:31.567292 [ 1895 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.613995 [ 1895 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Merge of part 2_0_5_1 started by some other replica, will wait it and fetch merged part
2023.04.06 05:59:31.703101 [ 1897 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.726014 [ 1897 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Merge of part 2_0_5_1 started by some other replica, will wait it and fetch merged part
2023.04.06 05:59:31.819764 [ 1892 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.852933 [ 1892 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Merge of part 2_0_5_1 started by some other replica, will wait it and fetch merged part
2023.04.06 05:59:31.934300 [ 1893 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:31.963715 [ 1893 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Merge of part 2_0_5_1 started by some other replica, will wait it and fetch merged part
2023.04.06 05:59:32.177973 [ 1892 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:32.208900 [ 1892 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Merge of part 2_0_5_1 started by some other replica, will wait it and fetch merged part
2023.04.06 05:59:32.262037 [ 1893 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:32.292510 [ 1893 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Merge of part 2_0_5_1 started by some other replica, will wait it and fetch merged part
2023.04.06 05:59:32.373372 [ 1894 ] {} <Trace> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): Executing log entry to merge parts 2_0_0_0, 2_1_1_0, 2_2_2_0, 2_3_3_0, 2_5_5_0 to 2_0_5_1
2023.04.06 05:59:33.040704 [ 1894 ] {} <Debug> 75a80e37-9556-4417-8e63-527508abe126::2_0_5_1 (MergeFromLogEntryTask): We took zero copy lock, but merge of part 2_0_5_1 finished by some other replica, will release lock and download merged part to avoid data duplication
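A natural next step is to find which replica actually removed the blob. A hedged sketch: grep every replica's log for removal messages mentioning the key. The "Removing object" message text and file names below are illustrative only (real removal messages come from the object-storage layer and vary between versions), and the logs are synthetic:

```shell
#!/bin/sh
set -e
# Sketch: find which replica's log records removal of the blob.
# File names and the "Removing object" message text are illustrative.
KEY='test/dnb/juitwdfwgdhufslujokimemazcxpt'
mkdir -p /tmp/replica-logs
cat > /tmp/replica-logs/r1.log <<'EOF'
2023.04.06 06:01:02 <Trace> S3ObjectStorage: Removing object test/dnb/juitwdfwgdhufslujokimemazcxpt
EOF
cat > /tmp/replica-logs/r2.log <<'EOF'
2023.04.06 06:01:01 <Trace> S3ObjectStorage: nothing to remove
EOF

# -H prefixes each match with its file name, identifying the replica
grep -aH "Removing object.*$KEY" /tmp/replica-logs/*.log
```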

Errors appear in clickhouse-server.final.log:
https://s3.amazonaws.com/clickhouse-test-reports/47104/554c9285347b80663a494aae25c8b7181c5ae1a5/stress_test__debug_/clickhouse-server.final.log

2023.04.06 06:19:54.651780 [ 32032 ] {} <Trace> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): Loading Outdated part 2_1_1_0 from disk s3
2023.04.06 06:19:54.813250 [ 32032 ] {} <Error> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): while loading part 2_1_1_0 on path store/75a/75a80e37-9556-4417-8e63-527508abe126/2_1_1_0: Code: 499. DB::Exception: The specified key does not exist.: while reading key: test/dnb/juitwdfwgdhufslujokimemazcxpt, from bucket: test. (S3_ERROR), Stack trace (when copying this message, always include the lines below):
2023.04.06 06:19:54.816499 [ 32032 ] {} <Error> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): Detaching broken part /var/lib/clickhouse/disks/s3/store/75a/75a80e37-9556-4417-8e63-527508abe126/2_1_1_0 (size: 1.06 KiB). If it happened after update, it is likely because of backward incompatibility. You need to resolve this manually

So the part is Outdated.
The covering Active part is loaded successfully:

2023.04.06 06:19:29.125056 [ 32338 ] {} <Trace> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): Loading Active part 2_0_5_1 from disk s3
2023.04.06 06:19:29.493513 [ 32338 ] {} <Test> test_mri0otkd.src_10 (75a80e37-9556-4417-8e63-527508abe126): loadDataPart: inserting 2_0_5_1 (state Active) into data_parts_indexes

To me this absolutely looks like expected behavior.
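The Outdated-vs-Active distinction can be checked mechanically by pairing each broken-part error with the preceding "Loading ... part" line for the same part name. A sketch over abbreviated versions of the log lines quoted above:

```shell
#!/bin/sh
set -e
# Sketch: for each broken-part error, find the Loading line for the same
# part name to see whether it was Active or Outdated at load time.
# Log lines are abbreviated versions of the ones quoted above.
cat > /tmp/final.log <<'EOF'
2023.04.06 06:19:29 <Trace> src_10: Loading Active part 2_0_5_1 from disk s3
2023.04.06 06:19:54 <Trace> src_10: Loading Outdated part 2_1_1_0 from disk s3
2023.04.06 06:19:54 <Error> src_10: while loading part 2_1_1_0 on path store/...: The specified key does not exist
EOF

# Pull the part name out of the error line, then show its Loading line
PART=$(grep -a '<Error>' /tmp/final.log | sed 's/.*while loading part \([^ ]*\) .*/\1/')
grep -a "Loading .* part $PART " /tmp/final.log
```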

Labels: fuzz (Problem found by one of the fuzzers), testing (Special issue with list of bugs found by CI)