Skip to content

add jvm.file_descriptor.limit metric#16174

Merged
trask merged 15 commits intoopen-telemetry:mainfrom
SylvainJuge:jvm-file-descriptor-limit
Mar 4, 2026
Merged

add jvm.file_descriptor.limit metric#16174
trask merged 15 commits intoopen-telemetry:mainfrom
SylvainJuge:jvm-file-descriptor-limit

Conversation

@SylvainJuge
Copy link
Copy Markdown
Contributor

@SylvainJuge SylvainJuge commented Feb 13, 2026

Add support for jvm.file_descriptor.limit metric.

Opened as a draft because:

Update: adding implementation in runtime-metrics with e1e2cb4 as #16087 might take a bit of time to be merged and dealing with conflicts should be minimal work, hence making this ready for review for implementation, the not yet released semconv link still makes test fail.

Comment thread instrumentation/jmx-metrics/library/jvm.md Outdated
@SylvainJuge SylvainJuge self-assigned this Feb 16, 2026
Comment on lines +76 to +89
metric ->
metric
.hasName("jvm.file_descriptor.count")
.hasInstrumentationScope(EXPECTED_SCOPE)
.hasDescription("Number of open file descriptors as reported by the JVM.")
.hasUnit("{file_descriptor}")
.hasLongSumSatisfying(sum -> sum.hasPointsSatisfying(point -> point.hasValue(0))),
metric ->
metric
.hasName("jvm.file_descriptor.limit")
.hasInstrumentationScope(EXPECTED_SCOPE)
.hasDescription("Measure of max open file descriptors as reported by the JVM.")
.hasUnit("{file_descriptor}")
.hasLongSumSatisfying(sum -> sum.hasPointsSatisfying(point -> point.hasValue(0))));
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] I've extended existing tests as-is, but I think what we really should have is no metric/data-point produced when the source value is negative instead of having a zero. Here when no limit it set (thus negative value) it does produces a limit of zero, which is incorrect and will thus be interpreted/filtered by the consumer. Do we have a way to express that using the metrics API at registration time ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have limited options:

  • Close the instrument when we see a negative observation
  • Check before registering the instrument and don't register it in the first place
  • Send -1

Do you know if this can happen in practice, or is just theoretical?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the case at hand, I think this is mostly theoretical, as the implementation in the JVM implementation should always return a non-negative value and Integer.MAX_VALUE in the worst case (which could also be something we'd like to filter-out).

I don't know why we haven't seen this while implementing #13589 to filter negative metric values in jmx-metrics, if I understand this correctly we should have had metrics with type updowncounter to report zeroes instead of filtering negative values. I will probably need to investigate this further.

Among all the suggested options, closing the instrument seems the most "clean" way to solve this, are there any particular downside apart from the extra complexity ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Among all the suggested options, closing the instrument seems the most "clean" way to solve this, are there any particular downside apart from the extra complexity ?

it seems good to me also

cc @jack-berg

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why a 0 value is being reported here. Given that the registered callback has if (value >= 0), no metric should be exported.

Is it asserting against data from a previous collection?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trask I think @SylvainJuge already summed it up. If the value is negative it is not recorded. If a value has not been recoded the instrument won't be reported. Once a value has been recoded the instrument is reported which makes the tests flaky as registerObservers_NegativeValue now will see different values depending on whether it runs first or after registerObservers.

the underlying resource limit can't be modified at runtime for the duration of the JVM execution
there is no transition from "with limit" to "no limit", so it's either set at startup or not set.

prlimit should let one change the limits for a running process https://man7.org/linux/man-pages/man1/prlimit.1.html
Note that according to an AI getMaxFileDescriptorCount does not return a negative value when no limit is set but rather Long.MAX_VALUE. -1 is returned when the limit can't be retrieved.
I think the current implementation is fine. The only thing that needs to be changed is that test methods must be made to run in a predictable order with @TestMethodOrder(MethodOrderer.OrderAnnotation.class) and org.junit.jupiter.api.Order annotations, a comment that explains why this is needed should also be added.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test order sounds like it would fix it, but I think this also indicates an underlying issue in the test harness's ability to isolate tests from each other.

Here's a minimal (passing) test that demonstrates the behavior I'm talking about: open-telemetry/opentelemetry-java@main...jack-berg:opentelemetry-java:async-demo

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jack-berg, your example allows to prove that there is in fact no metric sample produced when there is no call to .record(...).

I think the overall problem we have here is that the test checks for the presence of a given metric and we have no convenient way to test the absence of one.

So, in order to make the test reliable and still relevant, we can test both negative and non-negative values in the same method by making the mock return multiple values, I've implemented this in 0311845, removing the condition in the implementation make the test fail.

Regarding the fact that the implementation might return Long.MAX_VALUE, this is also covered in this test, but it works without explicitly ignoring it so maybe by a happy accident.

Also, this should be quite rare to get it because I expect most OSes to use some default value rather than "no limit".

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have no convenient way to test the absence of one

I think we typically use Thread.sleep(100)clearData()Thread.sleep(100) → assert empty pattern, e.g.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @trask, I'll try to reuse that when needed, do you think/prefer me to rework the test here ? I think it should be ok to be merged as-is.

@SylvainJuge
Copy link
Copy Markdown
Contributor Author

Semconv 1.40 has been published, thus we can mark this as ready as markdown links are now working.

@SylvainJuge SylvainJuge marked this pull request as ready for review February 23, 2026 14:10
@SylvainJuge SylvainJuge requested a review from a team as a code owner February 23, 2026 14:10
Comment on lines +76 to +89
metric ->
metric
.hasName("jvm.file_descriptor.count")
.hasInstrumentationScope(EXPECTED_SCOPE)
.hasDescription("Number of open file descriptors as reported by the JVM.")
.hasUnit("{file_descriptor}")
.hasLongSumSatisfying(sum -> sum.hasPointsSatisfying(point -> point.hasValue(0))),
metric ->
metric
.hasName("jvm.file_descriptor.limit")
.hasInstrumentationScope(EXPECTED_SCOPE)
.hasDescription("Measure of max open file descriptors as reported by the JVM.")
.hasUnit("{file_descriptor}")
.hasLongSumSatisfying(sum -> sum.hasPointsSatisfying(point -> point.hasValue(0))));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SylvainJuge any luck tracking this down?

@trask trask merged commit 6743036 into open-telemetry:main Mar 4, 2026
93 checks passed
@SylvainJuge SylvainJuge deleted the jvm-file-descriptor-limit branch March 5, 2026 08:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants