The `unittest_handle_GetTelemetrySubscriptions` test introduced with e3ae6e6 and v2.10.1 by Emanuele Sabellico (@emasab) revealed a long-standing type-size and endianness issue, likely introduced with KIP 714 compression support in v2.5.0.
Specifically, the v2.11.1 Debian s390x build failed with this error:
```
38s RDUT: FAIL: rdkafka_request.c:7012: unittest_handle_GetTelemetrySubscriptions: assert failed: rk->rk_telemetry.accepted_compression_types[0] == RD_KAFKA_COMPRESSION_GZIP: Expected 'gzip' compression type, got 'codec0x1000000?'
```
s390x is a 64-bit big-endian architecture. The value above is 0x1000000 == htonl(0x1) == htonl(RD_KAFKA_COMPRESSION_GZIP).
Digging a bit deeper:
`rk_telemetry.accepted_compression_types[]` is of type `rd_kafka_compression_t`, which is an enum. In C the size of an enum is implementation-defined, but on all mainstream ABIs `sizeof(rd_kafka_compression_t) == sizeof(int)`. In this case, that's == 4 (32 bits).
The code being tested is:
```c
void rd_kafka_handle_GetTelemetrySubscriptions(rd_kafka_t *rk,
                                               [...]
                                               void *opaque) {
        [...]
        rk->rk_telemetry.accepted_compression_types_cnt = arraycnt;
        rk->rk_telemetry.accepted_compression_types =
            rd_calloc(arraycnt, sizeof(rd_kafka_compression_t));
        for (i = 0; i < (size_t)arraycnt; i++)
                rd_kafka_buf_read_i8(
                    rkbuf,
                    &rk->rk_telemetry.accepted_compression_types[i]);
```
However, `rd_kafka_buf_read_i8()` reads an `int8_t`, i.e. size == 1. That's a bug. Changing this to use an intermediate 8-bit value, e.g.
```c
for (i = 0; i < (size_t)arraycnt; i++) {
        int8_t value;
        rd_kafka_buf_read_i8(rkbuf, &value);
        /* implicit conversion int8_t -> int */
        rk->rk_telemetry.accepted_compression_types[i] = value;
}
```
...makes the code correct and the unit test pass on both little-endian and big-endian architectures.
Note that the unit test, `unittest_handle_GetTelemetrySubscriptions`, has this code:

```c
rd_kafka_buf_write_i8(rkbuf, RD_KAFKA_COMPRESSION_GZIP)
```

That's actually correct -- that's what the code being tested above expects: an `int8_t`.
Additionally, this is the push code, which is NOT tested by the unit test:
```c
rd_kafka_PushTelemetryRequest(rd_kafka_broker_t *rkb,
                              [...]
                              const rd_kafka_compression_t compression_type,
                              [...]
                              void *opaque) {
        [...]
        size_t len = sizeof(rd_kafka_Uuid_t) + sizeof(int32_t) +
                     sizeof(rd_bool_t) + sizeof(compression_type) +
                     metrics_size;
        rkbuf = rd_kafka_buf_new_flexver_request(rkb, RD_KAFKAP_PushTelemetry,
                                                 1, len, rd_true);
        rd_kafka_buf_write_uuid(rkbuf, client_instance_id);
        rd_kafka_buf_write_i32(rkbuf, subscription_id);
        rd_kafka_buf_write_bool(rkbuf, terminating);
        rd_kafka_buf_write_i8(rkbuf, compression_type);
        [...]
}
```
A `sizeof(compression_type)` is being allocated, but an i8 is actually being written. `rd_kafka_buf_write_i8()` accepts an `int8_t` as its second argument, but an `int` (e.g. `int32_t`) is being passed, which results in an implicit cast.
I don't think this is an overrun, nor does it manifest as an actual bug beyond reserving 4 bytes of buffer space to store 1, but it is evidence of confusion with regard to the size of `rd_kafka_compression_t` and the implicit casts surrounding it, which can be a breeding ground for more serious bugs. At minimum, this should be `rd_kafka_buf_write_i8(rkbuf, (int8_t)compression_type);`
Instead of casting from/to `int8_t` on reads and writes (as suggested above), it may be worth making the in-memory representation match the wire size, i.e. using an `int8_t` rather than an enum for `rd_kafka_compression_t`.
(I did not check any of the other types used in telemetry requests.)