Skip to content

Conversation

@RincewindsHat
Copy link
Member

@RincewindsHat RincewindsHat commented Feb 19, 2025

This PR contains my approach to a new centralised infrastructure for output generation in C.
The idea is to have a data structure for check results, which contain the test result (exit code), the free form output and the perfdata.

The introduced funtionality should reduce the various hacks which were introduced to get the output in the right form and, at least in the bigger picture, reduce complexity overall.
An additional advantage is the possibility to have different formats which can be selected at runtime.
These formats currently include a "one line" format (similar to what the plugins produce right now), a "IcingaWeb2" format which looks generally nice and especially in "IcingaWeb2" the main web interface in Icinga setups.
More experimental is a JSON based format (currently named MP_FORMAT_TEST_JSON) with which the end to end tests for plugins can be improved (better selection of partial results or perfdata).
Currently this mostly useful for tests, but might be a good approach for a future replacement of the current output.

In addition to the general functionality I modified check_swap to use the new functionality to have a demonstrator.

For generating JSON I did include (vendor) the cJSON library in lib/vendor/cJSON.

replaces #1979

@RincewindsHat
Copy link
Member Author

@sni @waja little heads-up. Lots of changes, but I think this is progress.
I am open to criticism if you have the time for that.

@sni
Copy link
Contributor

sni commented Feb 20, 2025

that's a lot of code... :-) From a quick look, i would prefer the SWAP OK - or OK - format over the OK: variant. I'd say it looks cleaner. However, our own guidelines suggest using

Output should be in the format:
SERVICE STATUS: Information text

But I'd rather change the guidelines :-)

Besides that, I'd would prefer to keep vendor specific things out. But i am still curious, is there an example of how the icinga2 specific output differs from the normal output. And
isn't this the thing this PR tries to solve? Using the JSON output, icinga could format it
in any way it wants.

@RincewindsHat
Copy link
Member Author

It currently looks like this:

One-line

This output format is for systems which display only a single line of output. I don't know if and which systems that would be

# ./plugins/check_swap --output-format one-line
OK: Swap - [OK] - 99.6541% free (22274MiB out of 22351MiB)

icingaweb2

This format is multi-line and adds some indentation to improve readability if there are multiple "subchecks" (individual tests which are summarized by the main test).
The format [Status] gets formated nicely in Icingaweb2 which is where I see Monitoring Plugin Output (since I am an Icinga guy).

./plugins/check_swap --output-format icingaweb2                                                                                                                            
[OK] - Swap
	\_[OK] - 99.6542% free (22274MiB out of 22351MiB)|swap=23356252000B;;;0;23437308000

Summary only

This format was just added because I can. Maybe there is a purpose for that. Very short messages for alerting via SMS or something like that.

/plugins/check_swap --output-format summary-only                                                                                                                          
OK: Swap

mp-test-json

This is a JSON based output format which is a) good for internal testing (plugins/t/check_swap.t in the PR branch) and b) a proof of concept for some further experimentation.
with this format the exit code is no longer used for communication the test status and the plugin will exit with 0 if everything worked fine, indepently of the result of the test.
This avoids the ambiguity of not knowing whether a plugin crashed or the test failed.

./plugins/check_swap --output-format mp-test-json                                                                                                                          
{"state":"OK","summary":"Swap","checks":[{"output":"99.6542% free (22274MiB out of 22351MiB)","state":"OK","perfdata":[{"label":"swap","value":{"type":"uint","value":"23356252000"},"uom":"B","min":{"type":"int","value":"0"},"max":{"type":"uint","value":"23437308000"}}]}]}

@RincewindsHat
Copy link
Member Author

that's a lot of code... :-) From a quick look, i would prefer the SWAP OK - or OK - format over the OK: variant. I'd say it looks cleaner. However, our own guidelines suggest using

Output should be in the format:
SERVICE STATUS: Information text

But I'd rather change the guidelines :-)

No strong opinions on that, whatever satisfies the most people.

Besides that, I'd would prefer to keep vendor specific things out. But i am still curious, is there an example of how the icinga2 specific output differs from the normal output. And isn't this the thing this PR tries to solve? Using the JSON output, icinga could format it in any way it wants.

I do agree with that, in my ideal world formatting is not our problem and we just provide the pure information (ideally this is where the JSON experiment is going).
Icinga2 does currently not expect (or generally work with) JSON data (although there might be an ugly hack possible), but I still want to have useful output (looking specifically at check_disk with > 6 filesystems).
The name can be changed of course, I just did whatever would look good in Icingaweb2 right now.

@sni
Copy link
Contributor

sni commented Feb 20, 2025

thanks for the clarification.
I guess we can skip the one-line output. All monitoring systems support multiple lines for like
20 years or so. I'd also skip the summary variant, because you usually do not run the plugin
for a specific purpose like sms. Normally you already did run the plugin and then decide what
to do with the result. Then it would be nice to have some options, but I'd say, this can be
done with the JSON form.
For me, 2 output formats would be sufficient. JSON and a human readable clean text variant.
Naemon for example uses multiple lines of output, but the first line is the most important one.
All other lines of output are just details. Usually the first line is either short enough already or a summary. But with summary i think of a bit more the just "OK".

@RincewindsHat
Copy link
Member Author

The current design idea is, that the summary is either set by the plugin or generated automatically (mostly not that usefull probably). So that would be the duty of check_swap (in this case).

Is there some highlighting in Naemon (Thruk then I guess) which would be nice to have?

Thanks for the Input. I think I will remove one-line and summary-only then. Less code is better.

@sni
Copy link
Contributor

sni commented Feb 20, 2025

Is there some highlighting in Naemon (Thruk then I guess) which would be nice to have?

No idea what you mean.

Some more ideas and issues with plugin output we face so far is that some plugins have html output to present the data in a more human friendly way. However, this looks bad on a pager or in a sms. But issues like this could easily be solved with JSON data.
So I'd be happy with having JSON data and more or less the output like it is right now.
Maybe even both at once somehow.

@waja
Copy link
Member

waja commented Feb 21, 2025

Making output more streamlined is good, so I'm very happy anybody is caring about that.
As far I remember, I'm using plugins output in some setups for detecting things. check_snmp has those options and the snmpd is using plugins via extends. That's heavily depending on the output. Maybe same goes for checks via ssh.
So if we change the output (for the status or so) significant, we should document that carefully and prominently. When we release that, I would also recommend to do some "bigger" release. Don't know if it should be 3.0 or if a 2.5 is fine.

@RincewindsHat
Copy link
Member Author

@waja Could you elaborate a bit on what you mean with "I'm using plugins output in some setups for detecting things"? Do you have some tooling which tries to parse the output?

And yes, I would say 3.0 for changes like these.

@RincewindsHat RincewindsHat force-pushed the feature/new_output_infra branch from 44b4427 to 7c8c9d9 Compare February 21, 2025 12:57
@RincewindsHat RincewindsHat merged commit 75658bd into monitoring-plugins:master Feb 21, 2025
7 checks passed
@RincewindsHat RincewindsHat deleted the feature/new_output_infra branch February 21, 2025 13:33
@waja
Copy link
Member

waja commented Feb 21, 2025

Imagine anybody uses the monitoring plugins as snmp extends as described in https://rafael-monteiro.medium.com/monitoring-linux-systems-with-snmp-extend-method-55769ddae505 and gather the status via check_snmp with -r, --ereg=REGEX and match for something like '^OK:', cause the plugin used for the extend returns this, when everything is okay.
Now we change (maybe) the output of the Plugin when it's okay, so the regex needs to be changed. I don't know how usually this is, but seems I was not the only one with such an idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants