Skip to content

Try fix intersecting parts#23997

Merged
alesapin merged 9 commits intomasterfrom
fix_intersecting_parts
May 17, 2021
Merged

Try fix intersecting parts#23997
alesapin merged 9 commits intomasterfrom
fix_intersecting_parts

Conversation

@tavplubix
Copy link
Copy Markdown
Member

@tavplubix tavplubix commented May 10, 2021

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Some ALTER PARTITION queries might cause Part A intersects previous part B and Unexpected merged part C intersecting drop range D errors in replication queue. It's fixed.
Fixes #23296.

Detailed description / Documentation draft:
Fixed two independent bugs:

  • Race between creation of MERGE_PARTS and DROP_RANGE/REPLACE_RANGE on multiple leaders
  • Incorrect virtual part name for REPLACE_RANGE which breaks invariants in replication queue and merge predicate

@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label May 10, 2021
@alesapin alesapin self-assigned this May 12, 2021
@tavplubix tavplubix marked this pull request as ready for review May 13, 2021 14:05
Copy link
Copy Markdown
Member

@alesapin alesapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the current implementation, but I don't understand how it has been working before...

And maybe a backward-incompatible change would be OK.


/// Return {} because selection of merges in the partition where the column is cleared
/// should not be blocked (only execution of merges should be blocked).
if (type == CLEAR_COLUMN || type == CLEAR_INDEX)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use these types of entries in the queue, they are left just for compatibility. Maybe update comment.

getFakePartCoveringAllPartsInPartition(partition_id, drop_range, true);
std::optional<EphemeralLockInZooKeeper> delimiting_block_lock;
bool partition_was_empty = !getFakePartCoveringAllPartsInPartition(partition_id, drop_range, delimiting_block_lock, true);
if (replace && partition_was_empty)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But where is the guarantee that no concurrent insert can happen here? Or in this case it's OK?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok, all blocks with numbers less than delimiting_block_lock->getNumber() will be dropped, and newer blocks with greater numbers will not. But we must hold delimiting_block_lock until creation of DROP_RANGE/REPLACE_RANGE entry in replication log to avoid merges intersecting delimiting_block_lock->getNumber().

ops_src.emplace_back(zkutil::makeCreateRequest(
zookeeper_path + "/log/log-", entry_delete.toString(), zkutil::CreateMode::PersistentSequential));
ops_dest.emplace_back(zkutil::makeSetRequest(zookeeper_path + "/log", "", -1)); /// Just update version
ops_src.emplace_back(zkutil::makeSetRequest(zookeeper_path + "/log", "", -1)); /// Just update version
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add to comment: because merges assignment relies on it or something like this.

parts_to_remove_str += " ";
}
LOG_TRACE(log, "Replacing {} parts {}with {} parts {}", parts_to_remove.size(), parts_to_remove_str,
final_parts.size(), boost::algorithm::join(final_part_names, ", "));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw fmt::join requires less characters :)

Strings res = replace_range_entry->new_part_names;
auto drop_range_info = MergeTreePartInfo::fromPartName(replace_range_entry->drop_range_part_name, format_version);
assert(drop_range_info.getBlocksCount() != 0);
if (drop_range_info.getBlocksCount() > 1)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds weird, but maybe even create a separate function for this check?)

/// so we can distinguish dummy drop range from any real or virtual part.
/// But we should never construct such part name, even for virtual part,
/// because it can be confused with real part <partition>_0_0_0.
/// TODO get rid of this.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ohhh, maybe just add a flag and make it backward-incompatible?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will not be possible to backport the fix with such change

/// - drop range for MOVE PARTITION/ATTACH PARTITION FROM always contains 1 block

--right;
/// NOTE UINT_MAX was previously used as max level for REPLACE/MOVE PARTITION (it was incorrect)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to forget about it :)

{
/// Intersect left border
String error = "Unexpected merged part " + part->name + " intersecting drop range " + drop_range.getPartName();
if (!skip_intersecting_parts)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What the hell was that....

std::lock_guard merge_selecting_lock(merge_selecting_mutex);
queue.disableMergesInBlockRange(drop_range_fake_part_name);
}
/// Optional step
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad comment, need clarification.

Copy link
Copy Markdown
Member

@alesapin alesapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@alesapin
Copy link
Copy Markdown
Member

Functional stateless tests flaky check (address) -- timeout is ok, the test quite long.
Integration tests (asan) -- timeout ok
Integration tests (thread) -- both failed tests are known
Stress test (thread) -- single optimize query hung on waitForTableReplicaToProcessLogEntry. Again, looks like we wait something in sanitizer allocator https://gist.github.com/alesapin/6cbc18d51d10be790e636798bec0da63

@tavplubix tavplubix mentioned this pull request May 17, 2021
tavplubix added a commit that referenced this pull request May 18, 2021
tavplubix added a commit that referenced this pull request May 18, 2021
tavplubix added a commit that referenced this pull request May 18, 2021
tavplubix added a commit that referenced this pull request May 18, 2021
alesapin added a commit that referenced this pull request May 19, 2021
Backport #23997 to 21.4: Try fix intersecting parts
alesapin added a commit that referenced this pull request May 19, 2021
Backport #23997 to 21.5: Try fix intersecting parts
alesapin added a commit that referenced this pull request May 20, 2021
Backport #23997 to 21.3: Try fix intersecting parts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Part X intersects next part Y. It is a bug.

3 participants