Abort connections with no valid endpoint #10415
|
Can you please update the PR description with a brief summary of the problem? In particular, #10405 describes a much more complex scenario than what should be necessary to reproduce this.
Also, please share examples of how this behaves now. |
|
@julianbrost Please see the updated description. I hope this makes things clearer. |
|
Please verify whether ApiUsers authenticated using a TLS client certificate still work, see https://icinga.com/docs/icinga-2/latest/doc/12-icinga2-api/#icinga2-api-authentication |
4407a89 to f942abc
|
@julianbrost You are right, a certificate-based ApiUser could no longer connect with this PR. I've just pushed an updated version that should fix this. It returns a bit later and only for verified JSON-RPC connections. Now both verified connections with no endpoint (i.e. ApiUser with
Are there any other corner cases I'm not thinking about that need further testing? |
f942abc to 1b3a0a8
1b3a0a8 to 353386f
yhabteab
left a comment
LGTM!
HTTP requests using client certificates:
$ curl --cacert /var/lib/icinga2/certs/ca.crt --cert admin.crt --key admin.key -skS 'https://localhost:5667/v1/objects/services?pretty=1'
{
"results": [
{
"attrs": {
"__name": "test!service",
"acknowledgement": 2,
------
[2025-05-22 14:46:12 +0200] information/ApiListener: New client connection for identity 'admin' from [::1]:54903 (no Endpoint object found for identity)
[2025-05-22 14:46:12 +0200] information/HttpServerConnection: Request GET /v1/objects/services?pretty=1 (from [::1]:54903, user: admin, agent: curl/8.7.1, status: OK) took total 1ms.
[2025-05-22 14:46:12 +0200] information/HttpServerConnection: HTTP client disconnected (from [::1]:54903)
RPC client:
$ echo -n '86:{"jsonrpc":"2.0","method":"icinga::Hello","params":{"capabilities":3,"version":21450}}\r\n' | openssl s_client -connect localhost:5667 -cert admin.crt -key admin.key
...
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 19 (self-signed certificate in certificate chain)
---
DONE
---
[2025-05-22 14:53:44 +0200] information/ApiListener: New client connection for identity 'admin' from [::1]:54984 (no Endpoint object found for identity)
[2025-05-22 14:53:45 +0200] warning/ApiListener: Unknown endpoint 'admin' with valid certificate. Aborting JSON-RPC connection.
---
Problem
In #10405 the problem is that incoming connections with valid certificates, but from endpoints that are not defined locally, get added as anonymous clients (via ApiListener::AddAnonymousClient()) and then hang around essentially forever, since the check in JsonRpcConnection::CheckLiveness() only puts anonymous connections on a timeout if they are unauthenticated.
To Reproduce
#10405 describes a more complicated setup in detail, but the simplest way to reproduce the issue is to take a working, authenticated master/agent or master/satellite setup, comment out the master endpoint in the zones.conf of the agent/satellite, and restart.
Solution
Abort connections early when no endpoint is defined for the incoming connection. This is done by returning early from ApiListener::NewClientHandlerInternal when the certificate is validated, but no endpoint is configured for the remote.
Caveats
Since the client closes the connection very early, it is possible that the other side tries to read from or write to the socket, which then fails. For example, this message and stack trace can appear in the log:
A more complex solution that does not close the connection so abruptly for the remote would involve both sides of the JsonRpcConnection confirming the connection via an exchange of messages and should be considered in a future refactoring of the NewClientHandler code.
For now this closes #10405 by making the cluster checks fail reliably and keeps the parent from blindly sending requests to clients that just silently discard them.
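For illustration, the behavior described under Problem and Solution can be modeled roughly as follows. This is a simplified sketch and not the actual Icinga 2 code: the types Incoming, ClientType, and Action and the functions SubjectToAnonymousTimeout and Decide are hypothetical names invented for this example. It also assumes that HTTP connections are left untouched, as discussed in the review comments above. It needs C++14 or later.

```cpp
// Simplified model of the connection handling discussed in this PR.
// All names are hypothetical; none of this is Icinga 2 source code.

enum class ClientType { Http, JsonRpc };
enum class Action { HandleHttp, AttachToEndpoint, KeepAnonymous, Abort };

struct Incoming {
    ClientType type;
    bool verified;     // the peer presented a certificate that validated
    bool hasEndpoint;  // an Endpoint object is configured for the identity
};

// Liveness rule as described under "Problem": an anonymous connection
// (no endpoint) is only put on a timeout if it is unauthenticated.
constexpr bool SubjectToAnonymousTimeout(const Incoming& c)
{
    return !c.hasEndpoint && !c.verified;
}

// Decision as described under "Solution": abort verified JSON-RPC
// connections that have no matching endpoint.
constexpr Action Decide(const Incoming& c)
{
    if (c.type == ClientType::Http)
        return Action::HandleHttp;        // e.g. ApiUser with a client certificate
    if (c.verified && c.hasEndpoint)
        return Action::AttachToEndpoint;  // regular cluster connection
    if (c.verified)
        return Action::Abort;             // valid certificate, unknown endpoint
    return Action::KeepAnonymous;         // unauthenticated, subject to the timeout
}

// The gap: a verified connection without an endpoint was never timed out.
static_assert(!SubjectToAnonymousTimeout(Incoming{ClientType::JsonRpc, true, false}),
    "verified connection without endpoint is not covered by the timeout");

// With the early return, that same connection is aborted instead.
static_assert(Decide(Incoming{ClientType::JsonRpc, true, false}) == Action::Abort,
    "verified JSON-RPC connection without endpoint is aborted");
```

The two static_assert lines capture the point of the change: the case that the liveness timeout does not cover is exactly the case that is now aborted up front.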