
fix: add read deadline to tls write #3283

Merged
dnwe merged 1 commit into IBM:main from bvalente:tls-deadline
Sep 15, 2025

Conversation

Contributor

@bvalente bvalente commented Sep 8, 2025

Related to:

We're using https://github.com/Mongey/terraform-provider-kafka to manage Kafka topics with Terraform. Recently we changed from plaintext communication to AWS IAM authentication. After the switch, our provider would sometimes hang indefinitely on some plans. We traced this to the kafka.t3.small cluster tier, which has several limitations, including a maximum of 4 TCP connections per second.

While debugging the provider, we saw that the call stack was stuck writing to the cluster, specifically on the very first communication attempted with it. Reading through the code, we found a very interesting comment on the Write function of the crypto/tls package.

https://github.com/golang/go/blob/go1.23.0/src/crypto/tls/conn.go#L1192-L1195

// As Write calls [Conn.Handshake], in order to prevent indefinite blocking a deadline
// must be set for both [Conn.Read] and Write before Write is called when the handshake
// has not yet completed. See [Conn.SetDeadline], [Conn.SetReadDeadline], and
// [Conn.SetWriteDeadline].

Based on this, TLS requires both write and read deadlines to be set because the Write function may perform a handshake on the first communication, and the handshake both writes and reads.

I believe that in our case, since we are working with brokers that don't have a very reliable network, the handshake would sometimes not progress on the server side, and we would wait indefinitely for a read that would never come.

After applying this change on our local workstation, instead of hanging indefinitely, the program finally reported an error:

Error: kafka: client has run out of available brokers to talk to: read tcp 10.xxx.xxx.xxx:59582->10.xxx.xxx.xxx:9098: i/o timeout

Collaborator

@puellanivis puellanivis left a comment


The only thing I could think of is to join the time.Now() calls into a common local variable, so they’re both based on the same “now”.

But yeah, good change all over otherwise. 👍

Signed-off-by: Bernardo Valente <[email protected]>
@bvalente
Contributor Author

bvalente commented Sep 9, 2025

@puellanivis thank you for the review

I addressed your comment and force-pushed after rebasing onto master.

Collaborator

@puellanivis puellanivis left a comment


Looks great. :)

@bvalente
Contributor Author

Hello @puellanivis, what would be the process to get this merged and tagged? Is there a timeline, or anything I can do from our side? 🙂

@puellanivis
Collaborator

Sometimes reviews from IBM can take a while. I don’t actually have any ability to even approve in my code review, let alone merge anything. I’m just a third-party F/OSS contributor helping out with code reviews.

Collaborator

@dnwe dnwe left a comment


@bvalente thanks! This was a good catch.

@dnwe dnwe added the fix label Sep 15, 2025
@dnwe dnwe merged commit 25368c4 into IBM:main Sep 15, 2025
17 checks passed