Add Timer::enableHRTimer() by oschaaf · Pull Request #9155 · envoyproxy/envoy

oschaaf · 2019-11-27T10:32:44Z

Adds a way to set sub-millisecond timers. This creates an opportunity for Nighthawk to improve the accuracy of request-release timings. The fault filter extension might leverage this as well to inject sub-millisecond delays.

Implemented as an extra call instead of an overload or change to the existing one, because that way caller intent explicitly carries information about accuracy expectations. Also, adding an extra call minimizes changes needed to accomplish this.

For best results, libevent needs to be made aware too that we'd like to use precise timers.
One way to do that is by setting an environment variable called EVENT_PRECISE_TIMER.

EVENT_PRECISE_TIMER=1 bazel-bin/nighthawk_client ...

One thing to keep in mind is that the max sleep time will go down from std::chrono::milliseconds::duration::max() to std::chrono::microseconds::duration::max() with this. AFAICT we will still be able to sleep for ~292277 years with that.
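As a rough check of the ~292277 years figure, the sketch below converts the maximum of a duration type to whole years. It assumes a 64-bit signed representation (as used by common standard libraries) and an average Gregorian year of 31,556,952 seconds; `Years` and `maxInYears` are names made up for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <ratio>

// Assumption: one "year" is the average Gregorian year of 365.2425 days,
// i.e. 31,556,952 seconds.
using Years = std::chrono::duration<std::int64_t, std::ratio<31556952>>;

// Largest value representable by the given duration type, truncated to
// whole years.
template <typename Duration>
constexpr std::int64_t maxInYears() {
  return std::chrono::duration_cast<Years>(Duration::max()).count();
}
```

Under those assumptions, `maxInYears<std::chrono::microseconds>()` evaluates to 292277, and the millisecond variant is about a thousand times larger, so the reduced ceiling should not matter in practice.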

Description:
Risk Level: Low
Testing: By proxy, via existing tests

```
Use --sandbox_debug to see verbose messages from the sandbox
external/envoy/source/common/http/async_client_impl.cc:20:73: error: call to implicitly-deleted default constructor of 'const AsyncStreamImpl::NullRetryPolicy'
const AsyncStreamImpl::NullRetryPolicy AsyncStreamImpl::RouteEntryImpl::retry_policy_;
                                                                        ^
bazel-out/k8-dbg/bin/external/envoy/source/common/http/_virtual_includes/async_client_lib/common/http/async_client_impl.h:147:33: note: default constructor of 'NullRetryPolicy' is implicitly deleted because field 'retriable_status_codes_' of const-qualified type 'const std::vector<uint32_t>' (aka 'const vector<unsigned int>') would not be initialized
    const std::vector<uint32_t> retriable_status_codes_;
                                ^
1 error generated.
```

Signed-off-by: Otto van der Schaaf <[email protected]>

oschaaf · 2019-11-27T10:37:17Z

/cc @htuch

This reverts commit eba4053. Signed-off-by: Otto van der Schaaf <[email protected]>

yanavlasov · 2019-11-27T18:16:49Z

If the hi-res timers are used in the nighthawk for now, would it make sense to have this functionality just there?
Also would it make sense to make hi-res time enabled in the build files by default, instead of relying on users to remember to set on the command line during build?

htuch · 2019-11-27T18:41:13Z

@yanavlasov I think it's reasonable to add this to Envoy (I suggested this to @oschaaf). Why not make these the default though, rather than requiring specific enablement? Is there a performance or stability implication?

oschaaf · 2019-11-27T19:02:55Z

> If the hi-res timers are used in the nighthawk for now, would it make sense to have this functionality just there?

So I thought this might be nice to have here, as it seemed the change wouldn't add a significant maintenance burden: the implementation can be shared across the regular and HR versions (at least today), and Envoy extensions might potentially benefit from it?

> Also would it make sense to make hi-res time enabled in the build files by default, instead of relying on users to remember to set on the command line during build?

Yes, sure. Alternatively, this can also be set when initializing libevent instead of via an environment variable, but I was trying to minimize the change.

> Why not make these the default though, rather than requiring specific enablement? Is there a performance or stability implication?

I was thinking that in Nighthawk this should be enabled by default, but with an opt-out escape hatch for those who seek the absolute highest throughput, as I remember there is a trade-off involved [1]. IMHO Envoy would probably be better off with an opt-in for those who run specific extensions with strict latency requirements. Or maybe it would be nice to let extensions that need it enable the feature automatically at server boot time, e.g. via a new interface call that Envoy can query as it starts to determine whether any extension has strict latency requirements.

[1] From https://sourceforge.net/p/levent/mailman/message/29143607/

> There are a bunch of backends that can give us a reasonably good
> monotonic timer quickly, or a very precise monotonic timer less
> quickly. For example, Linux has CLOCK_MONOTONIC_COARSE (1msec
> precision), which is over twice as fast as CLOCK_MONOTONIC. Since
> epoll only lets you wait with 1msec precision,
> CLOCK_MONOTONIC_COARSE is a clear win.
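To see the precision side of that trade-off on a given machine, one can query the advertised resolution of both clocks with POSIX `clock_getres`. This is only a sketch: `clockResolutionNs` is a name invented here, and `CLOCK_MONOTONIC_COARSE` is Linux-specific, so it may not exist on other platforms.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>

// Returns the advertised resolution of a POSIX clock in nanoseconds,
// or -1 if the clock is not available on this system.
std::int64_t clockResolutionNs(clockid_t clock_id) {
  timespec res{};
  if (clock_getres(clock_id, &res) != 0) {
    return -1;
  }
  return static_cast<std::int64_t>(res.tv_sec) * 1000000000LL + res.tv_nsec;
}
```

Comparing `clockResolutionNs(CLOCK_MONOTONIC)` with `clockResolutionNs(CLOCK_MONOTONIC_COARSE)` on Linux should show the coarse clock reporting a much larger value, typically in the millisecond range; the exact numbers depend on the kernel configuration.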

htuch · 2019-11-27T19:36:37Z

@oschaaf this makes sense, but the implementation seems to be basically doing HR anyway for the regular timer case.

oschaaf · 2019-11-27T19:46:48Z

@htuch

> this makes sense, but the implementation seems to be basically doing HR anyway for the regular timer case.

Correct, but IMHO consumers shouldn't care about how the two calls are implemented internally. From the outside, my idea is that one call expresses the desire for a reasonably precise but efficient timer, and the other expresses a preference for high precision.

(the doxygen comments might be a good place to clarify this in case we want to take this forward)

I was thinking that this distinction might be useful some day: knowledge about caller intent could perhaps be leveraged to optimize with coarse-grained timer coalescing when favoring efficiency.
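A minimal sketch of that idea follows. `SketchTimer` is a hypothetical stand-in and not Envoy's `Timer` interface; it only shows two entry points sharing one implementation while recording what the caller asked for.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical, simplified stand-in for a timer; this is not Envoy's Timer
// interface, only an illustration of the two entry points discussed above.
class SketchTimer {
public:
  // Caller accepts millisecond granularity in exchange for efficiency.
  void enableTimer(std::chrono::milliseconds d) { arm(d, false); }

  // Caller explicitly asks for sub-millisecond precision.
  void enableHRTimer(std::chrono::microseconds d) { arm(d, true); }

  std::chrono::microseconds requested() const { return requested_; }
  bool wantsHighResolution() const { return high_resolution_; }

private:
  // Both calls share one implementation. The recorded flag is the extra
  // information a future implementation could use, for example to coalesce
  // coarse timers.
  void arm(std::chrono::microseconds d, bool high_resolution) {
    requested_ = d;
    high_resolution_ = high_resolution;
  }

  std::chrono::microseconds requested_{0};
  bool high_resolution_{false};
};
```

With this shape, `enableTimer(std::chrono::milliseconds(2))` and `enableHRTimer(std::chrono::microseconds(250))` arm the same underlying timer today, yet the call site still documents which accuracy the caller expects.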

htuch · 2019-12-02T19:00:05Z

@oschaaf yeah, that's reasonable. Can you add some tests for the new method?

oschaaf · 2019-12-02T22:54:17Z

> @oschaaf yeah, that's reasonable. Can you add some tests for the new method?

I took a stab at that. 9f00ad8 explicitly shares some test code between enableTimer() and enableHRTimer().

htuch

Thanks.. one thing I'd be curious about is whether it's possible to use simulated time to actually capture the timing precision? Not sure how feasible that is, LGTM otherwise.

oschaaf · 2019-12-03T08:54:22Z

> Thanks.. one thing I'd be curious about is whether it's possible to use simulated time to actually capture the timing precision? Not sure how feasible that is, LGTM otherwise.

Nice, that works, and the new tests also functionally cover theoretical precision for the regular enableTimer() implementation: 2dca213

htuch

LGTM, thanks!

This reverts commit 8db1682. Signed-off-by: Otto van der Schaaf <[email protected]>

This reverts commit 8db1682. server_fuzz_test flakes in CI on super large timeout values; backing this out to stabilize that test while sorting this out offline. Signed-off-by: Otto van der Schaaf <[email protected]>

This reverts commit a693496. Signed-off-by: Otto van der Schaaf <[email protected]>

- Updates TimerUtils::durationToTimeval to use templating to avoid implicit conversions of durations when passing the argument, which makes it possible to bound the input before converting it. Throws when passed a negative duration.
- Adds some more tests around this.

Slightly refactored version of envoyproxy#9155.

Signed-off-by: Otto van der Schaaf <[email protected]>

- Updates TimerUtils::durationToTimeval to use templating to avoid implicit conversions of durations when passing argument, thereby generating an opportunity to bound the input before doing conversions on it. Throws when passed a negative duration.
- Adds some more tests around this

Slightly refactored version of #9155

Risk Level: medium
Testing: unit tests

Signed-off-by: Otto van der Schaaf <[email protected]>

- Updates TimerUtils::durationToTimeval to use templating to avoid implicit conversions of durations when passing argument, thereby generating an opportunity to bound the input before doing conversions on it. Throws when passed a negative duration.
- Adds some more tests around this

Slightly refactored version of envoyproxy#9155

Risk Level: medium
Testing: unit tests

Signed-off-by: Otto van der Schaaf <[email protected]>
Signed-off-by: Prakhar <[email protected]>
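The templated conversion described in these commit messages might look roughly like the following. This is a hypothetical sketch rather than the code from the follow-up PR: the function name, the exception type, and the clamp at INT32_MAX seconds are all assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>
#include <ratio>
#include <stdexcept>
#include <sys/time.h>

// Hypothetical sketch, not the code from the PR: convert any std::chrono
// duration to a timeval without first narrowing it to a fixed unit.
template <typename Duration>
void durationToTimevalSketch(const Duration& d, timeval& tv) {
  // Assumption: ticks are one second or finer, so the cast to seconds below
  // divides and cannot overflow.
  static_assert(std::ratio_less_equal<typename Duration::period, std::ratio<1>>::value,
                "duration unit must be one second or finer");
  if (d.count() < 0) {
    throw std::invalid_argument("negative duration");
  }
  // Assumed bound: clamp to INT32_MAX seconds (roughly 68 years) so tv_sec
  // stays in range even where time_t is 32 bits wide.
  constexpr std::int64_t kMaxSeconds = std::numeric_limits<std::int32_t>::max();
  const auto secs = std::chrono::duration_cast<std::chrono::seconds>(d);
  if (secs.count() > kMaxSeconds) {
    tv.tv_sec = static_cast<time_t>(kMaxSeconds);
    tv.tv_usec = 0;
    return;
  }
  // The remainder is below one second, so it always fits in tv_usec.
  const auto usecs = std::chrono::duration_cast<std::chrono::microseconds>(d - secs);
  tv.tv_sec = static_cast<time_t>(secs.count());
  tv.tv_usec = static_cast<suseconds_t>(usecs.count());
}
```

Because the template receives the caller's duration type unchanged, the bounds check runs before any unit conversion, which is the point the commit message makes about avoiding implicit conversions.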

oschaaf added 2 commits November 22, 2019 11:12

Introduce enableHRTimer()

e1e83ee

Signed-off-by: Otto van der Schaaf <[email protected]>

oschaaf mentioned this pull request Nov 27, 2019

Switch to enableHRTimer -- improve request-release timing accuracy envoyproxy/nighthawk#217

Merged

4 tasks

oschaaf added 4 commits November 27, 2019 11:51

Fix dispatcher_impl_test.cc build

d438c1f

Signed-off-by: Otto van der Schaaf <[email protected]>

Tweak max duration in TimerImpl conversion test

515bce9

Signed-off-by: Otto van der Schaaf <[email protected]>

Revert "Fix compilation error"

6439fd3

This reverts commit eba4053. Signed-off-by: Otto van der Schaaf <[email protected]>

Back out ASSERT & death test. Doesn't work in CI.

85c2569

Signed-off-by: Otto van der Schaaf <[email protected]>

zuercher assigned htuch Nov 27, 2019

htuch requested a review from yanavlasov November 27, 2019 18:39

htuch assigned yanavlasov Nov 27, 2019

htuch added the waiting label Dec 2, 2019

Merge remote-tracking branch 'upstream/master' into hrtimer

98615a6

repokitteh-read-only bot removed the waiting label Dec 2, 2019

Test enableHRTimer

9f00ad8

Signed-off-by: Otto van der Schaaf <[email protected]>

yanavlasov previously approved these changes Dec 3, 2019

htuch reviewed Dec 3, 2019

Test that timers theoretically will fire on time (HR/non-HR)

2dca213

Signed-off-by: Otto van der Schaaf <[email protected]>

oschaaf dismissed yanavlasov’s stale review via 2dca213 December 3, 2019 08:50

yanavlasov approved these changes Dec 3, 2019

htuch approved these changes Dec 3, 2019

htuch merged commit 8db1682 into envoyproxy:master Dec 3, 2019

lizan mentioned this pull request Dec 3, 2019

server_fuzz_test is flaky #9207

Closed

oschaaf added a commit to oschaaf/envoy that referenced this pull request Dec 4, 2019

Revert "Add Timer::enableHRTimer() (envoyproxy#9155)"

a693496

This reverts commit 8db1682. Signed-off-by: Otto van der Schaaf <[email protected]>

oschaaf mentioned this pull request Dec 4, 2019

Add TimerImpl::enableHRTimer - take two #9229

Merged

oschaaf added a commit to oschaaf/envoy that referenced this pull request Dec 4, 2019

Revert "Revert "Add Timer::enableHRTimer() (envoyproxy#9155)""

7a427c6

This reverts commit a693496. Signed-off-by: Otto van der Schaaf <[email protected]>

Add Timer::enableHRTimer() #9155
htuch merged 9 commits into envoyproxy:master from oschaaf:hrtimer
