SOL-143245: Updated librdkafka to 2.12.1 #6
…a instances (confluentinc#4724) Circular dependencies from a partition fetch queue message to the same partition blocked the destroy of an instance, that happened in case the partition was removed from the cluster while it was being consumed. Solved by purging internal partition queue, after being stopped and removed, to allow reference count to reach zero and trigger a destroy. Purging internal fetch queue on removing the partition only for the consumer.
* Security upgrade for OpenSSL and Curl, CVEs fixed:
  OpenSSL: CVE-2024-2511, CVE-2024-4603, CVE-2024-4741, CVE-2024-5535, CVE-2024-6119
  Curl: CVE-2024-8096, CVE-2024-7264, CVE-2024-6874, CVE-2024-6197
* Fix for curl configure failure caused by curl/curl#14373
) must be equal to the server sent nonce, that already contains the client side nonce. librdkafka was incorrectly concatenating the client side nonce again, leading to this fix being made on AK side, released in 3.8.1, with endsWith instead of equals. apache/kafka@0a00456
except the Style check job because it needs clang format 10 Fix to inherit javac path, needed by test 0098
Mock handler implementation Rename current consumer protocol from generic to classic Mock handler with automatic or manual assignment More consumer group metadata getters Test helpers Configurable session timeout and HB interval Fix mock handler ListOffsets response LeaderEpoch instead of CurrentLeaderEpoch Integration tests passing with AK trunk Improve documentation and KIP 848 specific mock tests Add mock tests for unknown topic id in metadata request and partial reconciliation Make test 0147 more reliable Fix test 0106 after HB timeout change Exclude test case with AK trunk Rename rd_kafka_buf_write_tags to rd_kafka_buf_write_tags_empty Trivup 0.12.5 can run a KafkaCluster directly with KRaft and AK trunk Trivup 0.12.6 build with a specific commit Trivup 0.12.7 with fixes for AK 3.8.0 and Py 3.12 New version of trivup 0.12.7 to fix an issue with apache/kafka#16464 on AK > 3.8.0 Static group membership mock tests Move test 0147 to a different PR Disable interactive "needsrestart" prompt
* test_read_file can read binary files too * Trivup 0.12.8 * Read certificate CA chain when set using a configuration setter with PEM format. Test that CA with untrusted chain fails authentication. * Test untrusted certificate signed with an intermediate CA * Remove private key and duplicate certs from pem client certificate * Print logs sent as events * Trivup now already inherits the environment in interactive mode * Use namespace to avoid conflicts on TestEventCb
…h with client certificate chain (confluentinc#4900) Failing test: expect the error code that is received when no certificate is sent instead of the one received when it's sent but not trusted. Client cert callback to check if trusted certificate authorities match with client certificate chain. Log a warning when client certificate isn't sent --------- Co-authored-by: trnguyencflt <[email protected]>
… leader epoch (confluentinc#4901) Failing tests including for confluentinc#4796 and confluentinc#4804 Closes confluentinc#4796 and confluentinc#4804 CHANGELOG Fix for the correct expected RPC code in test 0139 Apply same fix to metadata update operation too Don't change rktp state to active when there's no leader but wait until it's available to validate it Comment about excluded -1 value
An incorrect assumption is made that libssl is built with support for the (now-deprecated) ENGINE API if it is provided by OpenSSL >= 1.1.0 or LibreSSL. OPENSSL_NO_ENGINE is defined by OpenSSL and all of its forks if the ENGINE API was disabled at compile-time - ensure that the definition of OPENSSL_NO_ENGINE is taken into account when using ENGINE features.
…and client is using SASL authentication only (confluentinc#4936) without any client certificate set
* removing generated internal project.yml * removing generated public project.yml --------- Co-authored-by: service-bot-app[bot] <189278048+service-bot-app[bot]@users.noreply.github.com>
* removing generated internal project.yml * removing generated public project.yml --------- Co-authored-by: service-bot-app[bot] <189278048+service-bot-app[bot]@users.noreply.github.com>
* Verify Ubuntu 24.04 and arm64 packages * Add Semaphore task for verifying
…#4967) Style fixed check_features.c
as it's not used anymore. Was used for AppVeyor CI builds.
… number of iterations (confluentinc#5002)
…tinc#4908) Closes: confluentinc#4059. Commits during a rebalance could cause the assignment to be lost if the generation id was bumped by a second join group request. Solved by not re-joining the group in case an illegal generation error happens during a rebalance. Happening since v1.6.0.
Semaphore pipeline linked to a task that runs the full test suite with customizable parameters. Contains a promotion to run it automatically on master commits only.
…tes (confluentinc#5006) Remove double quotes
as timeout and checking after wakeups if it's been reached. Avoids yielding earlier than requested because of spurious wakeups. Fix flakiness in many tests, especially 0080
because of the fetch backoff left from previous broker. Resets the fetch backoff when the partition joins a new broker.
due to latency increase applying to all RPCs, including ApiVersions, leading to the timeout happening before the produce request is sent. The error is IN_QUEUE instead of IN_FLIGHT, and the status becomes NOT_PERSISTED instead of POSSIBLY_PERSISTED. Fixed using the mock cluster instead of sockem and applying the latency only to the Produce request.
…e with big-endian architectures (confluentinc#5183) * Fix compression types read issue in GetTelemetrySubscriptions response for big-endian architectures * Decrease allocated buffer size in `rd_kafka_PushTelemetryRequest` and explicitly cast the enum
…roupHeartbeat not updating member epoch in a case (confluentinc#4672) [KIP-848] Fixed a condition where error was being raised in commit due to old error in the topic partition [KIP-848] Fix discarding heartbeat response without epoch update when leaving during inflight HB
Re-bootstrap is now triggered only after metadata.recovery.rebootstrap.trigger.ms has passed since the first metadata refresh request after the last successful metadata response. Previously the interval was measured from the last successful metadata response, so it could overlap with the periodic topic.metadata.refresh.interval.ms refresh and cause a re-bootstrap even when not needed.
…m them (confluentinc#4931)
* Fetched committed offsets should be validated before starting to consume from them. Failing test and mock handler implementation for returning the committed offset leader epoch instead of current leader epoch.
* Validate the offsets before starting to fetch assigned partitions
* Add more test cases for partition assignment offset validation
* Fix for test 0139 subtest `do_test_store_offset_without_leader_epoch`. When fetching an offset it returns the leader epoch used when committing, not the current leader epoch. Given the mock cluster fix the test needs to be changed.
* Fix test `0139` subtest `do_test_list_offsets_leader_change`: use cloned partition list for listing offsets, so that the fake leader epoch isn't then used for validation when assigning. Fix ListOffsets mock handler for logging the correct returned leader epoch.
* Changelog entry
* Reduce number of tests in quick mode
* Add a new fetch state when finishing validating and starting to seek after a truncation, to avoid a second repeated validation and possibly duplicated messages.
* Increase single test timeout
* Fix to leave the group in `rd_kafka_cgrp_incr_unassign_done` if terminate was requested, as done in `rd_kafka_cgrp_unassign_done` and `rd_kafka_cgrp_consumer_incr_unassign_done`
* Mock cluster, set the group as empty when last member leaves instead of triggering a rebalance
* Test 0139 with mock cluster marked as local. Doesn't delete topic if tests are local only as it's possible there's no cluster to connect to and it speeds up completing the test
* Resume the partition before fetch start or before validation
--------- Co-authored-by: Arthur O'Dwyer <[email protected]>
* Revert setting timeout to infinity * style fix * Changelog change * Changelog changes * Changelog change
* Fix flakiness in test 0085 * Errors that cause a refresh coordinator like NOT_COORDINATOR during an offset fetch should not be propagated to the application.
…al promotions (confluentinc#5191) * Pipeline improvements about machine types and auto-cancel * Use cached docker image for integration tests, style checks, docs build * vcpkg cache * msys2 cache * Upgrade macOS agents
…fluentinc#5155) * Implementation of OAUTHBEARER/OIDC metadata based authentication, initially supporting the Azure UAMI method. * Tests with trivup 0.14.0 supporting metadata based authentications * Add documentation and changelog entry * Rename `azure` value to `azure_imds` and replace UAMI that is the identity with IMDS that is the authentication service * Extract authentication URL and rename internal function and enums * Changes to name the configuration property "query" instead of "params" as in other implementations and to make it optional if the default endpoint is overridden.
…sume from them (confluentinc#4931)" (confluentinc#5207) This reverts commit 13a2bba.
…odes (confluentinc#5194) * Add test cases for new OffsetCommit and OffsetFetch Error Codes * Testcase for discarding the member epoch in a consumer group heartbeat response when leaving with an inflight HB
confluentinc#5214) * Changelog changes and some modification to the KIP-848 migration guide * Add that KIP-848 is not enabled by default and other PR comments
* Downgrade min supported OSX version to 13 * Version upgrade to v2.12.1
rttinfo[0] = 0;

rkb->rkb_c.skip_broker_down = rd_true;
rd_kafka_broker_fail(rkb, LOG_WARNING,
Confluent's librdkafka has changed this to rd_kafka_broker_planned_fail(...). As far as I can see, the difference between their call to rd_kafka_broker_planned_fail(...) and
this merge of
rkb->rkb_c.skip_broker_down = rd_true;
rd_kafka_broker_fail(rkb, LOG_WARNING, ...
is just the log level: this code will result in LOG_WARNING, while rd_kafka_broker_planned_fail(...) will result in LOG_DEBUG.
I think we should take Confluent's change to rd_kafka_broker_planned_fail(...). Originally, rd_kafka_broker.c was full of LOG_ERR, which caused AFW and QA to freak out. So, I did a bulk lowering of many of rd_kafka_broker.c's LOG_ERR to LOG_WARNING. It looks like Confluent has gone even further, lowering LOG_ERR to LOG_DEBUG. I think we should take Confluent's judgement on the log severity.
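To make the comparison concrete, here is a small self-contained model of the two call shapes as described in this comment. It only encodes the reading above (same flag, different log level). `struct broker`, `broker_fail` and `broker_planned_fail` are hypothetical stand-ins, and the real rd_kafka_broker_planned_fail(...) may do more than this.

```c
#include <stdbool.h>
#include <syslog.h>

/* Hypothetical stand-in for the broker handle; not librdkafka's struct. */
struct broker {
        bool skip_broker_down; /* models rkb->rkb_c.skip_broker_down */
        int last_fail_level;   /* records the level the failure was logged at */
};

/* Hypothetical model of a fail call that takes an explicit log level. */
static void broker_fail(struct broker *b, int level) {
        b->last_fail_level = level;
}

/* Hypothetical model of the "planned" variant as read in the comment:
 * it sets the same flag but reports the failure at LOG_DEBUG. */
static void broker_planned_fail(struct broker *b) {
        b->skip_broker_down = true;
        broker_fail(b, LOG_DEBUG);
}
```

Under this model the only observable difference between the merged code and the upstream call is the recorded severity, which is why the choice comes down to how noisy the log should be.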
"Expected token as a string value");
goto fail;
}
Prior to the Confluent merge, there would have been code here like:
if (rk->rk_conf.debug_sensitive) {
        rd_kafka_dbg(rk, SECURITY, "OIDC",
                     "Received JWT token \"%s\"", jwt_token);
}
This merge has stripped this out. (With the code re-arrangement, I don't think rk is even in scope in this function.) I think I'm OK with that; we can add it back later if need be.
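If the gated logging ever needs to come back, one option is to pass the flag into the function explicitly, since rk may no longer be in scope there. The sketch below shows that shape only. `struct conf` and `format_jwt_debug` are hypothetical names, and it formats into a caller-supplied buffer rather than calling rd_kafka_dbg.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant part of the configuration. */
struct conf {
        bool debug_sensitive;
};

/* Hypothetical helper: writes the debug line into out only when
 * debug_sensitive is enabled. Returns the number of characters that
 * snprintf reports, or 0 when the token is withheld. */
static int format_jwt_debug(const struct conf *conf, const char *jwt_token,
                            char *out, size_t outlen) {
        if (!conf->debug_sensitive) {
                if (outlen > 0)
                        out[0] = '\0'; /* never leak the token by default */
                return 0;
        }
        return snprintf(out, outlen, "Received JWT token \"%s\"", jwt_token);
}
```

Keeping the check inside the helper means a caller cannot log the token by accident when the flag is off.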
kwdubuc left a comment
I reviewed only the merge collisions, and have only one suggested change in rdkafka_broker.c.
…ibrdkafka/master/SOL-143245/0
…t_error() to report all errors to the application
… disconnect error string
I got conflicts in: