Support string typed metrics #9139
Description
Proposal
Use case. Why is this important?
There are cases where the interpretation of other metric values depends on mode-like properties. For example, on Z system machines, logical partitions can either dedicate or share their processors. This is represented as a string-enum-typed resource property "processor-mode" with the values "shared" or "dedicated". You need to know that mode if you want to calculate derived metrics or define alerts based on other integer- or float-typed metrics around the processors (I won't go into the details here). In our case, the processor mode of a logical partition can change over time, so it is appropriate to transfer it along with the other metrics whose interpretation depends on it.
Another use case is status values that someone wants to put into the metric store in order to correlate other metrics with them, or simply for long-term storage of the status. For example, if a logical partition is in the stopped state, it still has processors assigned in its definition, but they are now used (time-shared) by other partitions. That needs to be taken into account when calculating derived metrics, so it is important to have status information available along with the metrics.
The use cases above are not unique to Z systems. I assume that similar use cases exist in almost any complex environment, e.g. in container-based environments or in cloud services.
I did read issue #2227, but it is now 5 years old and there has been development since then. Other monitoring solutions support string-typed metrics in the meantime. Here are just some that I stumbled across:
- SysDig: https://docs.sysdig.com/en/metrics-dictionary.html.
- Nagios supports enumerated state values (https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/3/en/statetypes.html).
- Zabbix supports items that can have "short character data" as a value (https://www.zabbix.com/documentation/current/manual/config/items/item).
Alternatives
I do understand that we can map string-enum-typed metrics to integer values, and that is also how we will start out supporting this in our exporter (https://github.com/zhmcclient/zhmc-prometheus-exporter). However, it is far more natural to use the string enum values directly, and doing so avoids having to document long mapping lists for all kinds of status or mode properties.
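To make the integer-mapping workaround concrete, here is a minimal sketch in Python. The metric name `zhmc_partition_processor_mode`, the `partition` label, and the integer codes are all hypothetical and chosen only for illustration; they are not taken from the exporter. The second helper shows a commonly used variant in which each possible state becomes its own 0/1 sample with the state carried as a label, so that no mapping list is needed to read the value.

```python
# Hypothetical mapping; the codes an actual exporter uses may differ,
# and every such mapping has to be documented for users.
PROCESSOR_MODE_CODES = {"shared": 0, "dedicated": 1}


def encode_enum(value, codes):
    """Return the integer code for a string enum value."""
    try:
        return codes[value]
    except KeyError:
        raise ValueError(f"unmapped enum value: {value!r}") from None


def as_integer_sample(name, partition, value, codes):
    """Render one sample line with the enum value mapped to an integer."""
    return f'{name}{{partition="{partition}"}} {encode_enum(value, codes)}'


def as_state_samples(name, partition, value, codes):
    """Render one 0/1 sample per possible state, with the state as a label."""
    if value not in codes:
        raise ValueError(f"unmapped enum value: {value!r}")
    return [
        f'{name}{{partition="{partition}",state="{state}"}} {int(state == value)}'
        for state in codes
    ]


if __name__ == "__main__":
    name = "zhmc_partition_processor_mode"  # hypothetical metric name
    print(as_integer_sample(name, "LPAR1", "dedicated", PROCESSOR_MODE_CODES))
    for line in as_state_samples(name, "LPAR1", "dedicated", PROCESSOR_MODE_CODES):
        print(line)
```

With the integer form, a reader of the stored value has to look up what `1` means; with the per-state form, the string is visible in the label, at the cost of one time series per possible state.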
Another alternative would be to transport string-typed values from the monitored system using a second monitoring solution that supports them. However, that puts the burden on the user to maintain two metric-gathering environments on each system, and it is also a slippery slope, since it opens the door to future consolidation in the "wrong" direction.