Skip to content

Conversation

@alexmiller-apple
Copy link
Contributor

@alexmiller-apple alexmiller-apple commented Mar 13, 2020

I didn't realize that status works via collecting traceevents from workers, so I'm trimming my plans down to just "hz", and the playbooks can link to a dashboard that aggregates TLSPolicyFailure trace events itself.

A trimmed status json from a failing process now looks like:

{
    "cluster" : {
        "processes" : {
            "bd91289b650802668e2217cc68f0463b" : {
                "address" : "127.0.0.1:5000:tls",
                "network" : {
                    "tls_policy_failures" : {
                        "hz" : 0.4
                    }
                }
            }
        }
    }
}

This reports the number of policy failures over the past 5s interval.
It also is step 1 towards getting this information into status json.
This allows monitoring of TLS policy failures, but one has to go scrape
for TLSPolicyFailure trace events to figure out why they're happening.
@ajbeamon ajbeamon merged commit 5f4373c into apple:release-6.2 Mar 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants