Skip to content

Fix race in cJSON between librdkafka and aws-c-common#94343

Merged
rschu1ze merged 3 commits intomasterfrom
maybe-fix-80866
Jan 22, 2026
Merged

Fix race in cJSON between librdkafka and aws-c-common#94343
rschu1ze merged 3 commits intomasterfrom
maybe-fix-80866

Conversation

@rschu1ze
Copy link
Copy Markdown
Member

@rschu1ze rschu1ze commented Jan 15, 2026

Fixes #80866

#80866 happens only rarely and I didn't try to reproduce it locally. Therefore, this PR takes a brute-force approach. In the absence of namespaces in C, this PR prefixes

  • all public functions,
  • all private functions, and
  • cJSON internal_hooks static variable

in librdkafka's version of cJSON with kafka_. Calls to cJSON in librdkafka were redirected to prefixed functions. This makes sure the hard way that librdkafka and aws-c-common never interact with the other's cJSON anymore.

The maintenance headaches created by this PR are manageable as upstream cJSON is kind of dead and no updates are expected in future.

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Jan 15, 2026

Workflow [PR], commit [389d969]

Summary:

job_name test_name status info comment
BuzzHouse (amd_debug) failure
Logical error: 'Inconsistent AST formatting: the query: (STID: 1941-1bfa) FAIL cidb, issue ISSUE EXISTS
BuzzHouse (arm_asan) failure
Logical error: Cannot find transaction A that has started mutation B that is going to be applied to part C (STID: 3968-4fab) FAIL cidb, issue ISSUE EXISTS

@clickhouse-gh clickhouse-gh bot added pr-not-for-changelog This PR should not be mentioned in the changelog submodule changed At least one submodule changed in this PR. labels Jan 15, 2026
@rschu1ze rschu1ze changed the title Maybe fix #80866 Fix race in cJSON from librdkafka and aws-c-common Jan 15, 2026
@rschu1ze rschu1ze changed the title Fix race in cJSON from librdkafka and aws-c-common Fix race in cJSON between librdkafka and aws-c-common Jan 15, 2026
@rschu1ze
Copy link
Copy Markdown
Member Author

@antaljanosbenjamin perhaps for review?

@antaljanosbenjamin antaljanosbenjamin self-assigned this Jan 16, 2026
@antaljanosbenjamin
Copy link
Copy Markdown
Member

Based on this comment, it is not clear to me if this is always a good thing. However, I assume that is solved in the meantime. If not, we can always revert the PR. There is one thing though: could you please add your patch to the file in our fork of librdkafka where we describe what needs to be applied? I think what you need to do:

  1. I would create a section similar to "New fixes for 2.8", something like "Our changes for 2.8", because this is something we never want to merge back to librdkafka.
  2. Add this patch, probabaly referencing only your commit should be fine. If you don't mind adding a note about searching for new cJSON usage is newer kafka versions, it would nice to mention that.
  3. Extend the git commands at the end of the document.

This way if we want to update librdkafka to a new version, we know what we have to do to keep the current working state of our fork.

@antaljanosbenjamin
Copy link
Copy Markdown
Member

The changes themselves and the idea I think is okay. I am kind of unsure about that parsing thing, and this way we would definitely use the kafka included cJSON more, but let's see what happens.

@rschu1ze
Copy link
Copy Markdown
Member Author

rschu1ze commented Jan 18, 2026

@antaljanosbenjamin I did as you said, thanks. Wasn't aware that our librdkafka fork comes with such sophisticated upgrade instructions. Definitely useful.

About confluentinc/librdkafka#4159 (comment): There are two ways how cJSON can be misused at the moment:

  • a race for the same static initialization handler from librdkafka and aws-c-common, this is fixed by this PR,
  • potential non-tread-safety, happens if the same user (e.g. librdkafka) calls certain cJSON functions concurrently (here are some docs about this problem). There is however a chance that librdkafka's usage pattern is safe. Not addressed by this PR. In any case, the crashstack would look different than the one which is fixed here. We could resolve that issue separately if it happens to occur (haven't seen it in CI though).

@rschu1ze rschu1ze added this pull request to the merge queue Jan 22, 2026
Merged via the queue into master with commit c473ad7 Jan 22, 2026
129 of 132 checks passed
@rschu1ze rschu1ze deleted the maybe-fix-80866 branch January 22, 2026 22:13
@robot-ch-test-poll robot-ch-test-poll added the pr-synced-to-cloud The PR is synced to the cloud repo label Jan 22, 2026
@antaljanosbenjamin antaljanosbenjamin added the post-approved Approved, but after the PR is merged. label Mar 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

post-approved Approved, but after the PR is merged. pr-not-for-changelog This PR should not be mentioned in the changelog pr-synced-to-cloud The PR is synced to the cloud repo submodule changed At least one submodule changed in this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cJSON conflicts: aws-c-common / librdkafka (data-race in cJSON)

3 participants