
profiling: Benchmark and profiling scripts#278

Closed
pothos wants to merge 1 commit into linkerd:master from kinvolk-archives:benchmark-and-profiling

Conversation


@pothos pothos commented Jun 25, 2019

A benchmark/profiling script for local development
or a CI helps to catch performance regressions early
and find spots for optimization.

The benchmark setup consists of a cargo test
that reuses the test infrastructure to forward
localhost connections. This test is skipped by
default unless an env var is set.
The benchmark load comes from a fortio server
and client for HTTP/gRPC req/s latency measurement
and from an iperf server and client for TCP throughput
measurement.
In addition to the fortio UI to inspect the benchmark
data, the results are also stored to a summary text file
which can be used to plot the difference of the summary
results of, e.g., two git branches.
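The flow above might look roughly like the following shell sketch. The env var, port, flags, and file names are assumptions for illustration, not the PR's actual scripts:

```shell
#!/bin/sh
# Hedged sketch of the benchmark flow; all names here are illustrative only.
SUMMARY=benchmark-summary.txt

# The cargo test that forwards localhost connections is skipped unless an
# env var is set (variable name assumed; run in the background while loading):
# BENCHMARK=1 cargo test --release profiling -- --nocapture &

# Drive HTTP load with fortio against the proxy's forwarded port,
# if fortio is installed (port and qps are assumptions):
if command -v fortio >/dev/null 2>&1; then
    fortio load -qps 1000 -t 5s -json http-result.json http://localhost:8080/
else
    echo "fortio not installed; skipping load generation"
fi

# Append one "name value" line per metric so runs can be compared later
# (0 is a placeholder; a real script would extract it from the fortio result).
echo "http_req_per_s 0" >> "$SUMMARY"
cat "$SUMMARY"
```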

The profiling setup is the same as above but also
runs "perf" or "memory-profiler" to sample the
call stacks either at runtime or on heap allocation
calls. This requires a special debug build with
optimizations, which can be generated with a build script.
The results can be inspected as interactive flamegraph
SVGs in the browser.
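For the perf path, the sampling-and-render step might look like this sketch, assuming perf plus the classic stackcollapse-perf.pl/flamegraph.pl tooling; the binary path and flags are assumptions, not the PR's scripts:

```shell
#!/bin/sh
# Hedged sketch: sample call stacks at 99 Hz and render a flamegraph SVG.
BIN=./target/profiling/linkerd2-proxy   # assumed path to the optimized debug build
if command -v perf >/dev/null 2>&1 && [ -x "$BIN" ]; then
    # Record call graphs while the benchmark load runs,
    # then fold the stacks and render an interactive SVG.
    perf record -F 99 -g -- "$BIN"   # stop once the load has finished
    perf script | stackcollapse-perf.pl | flamegraph.pl > flamegraph.svg
else
    echo "perf or profiling build not available; skipping"
fi
```

The resulting flamegraph.svg can then be opened in a browser to explore the call stacks interactively.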

Please follow the instructions in the profiling/README.md
file for how to use the scripts.

Signed-off-by: Kai Lüke [email protected]

Discussion

This commit is based on our work in the branch https://github.com/kinvolk/linkerd2-proxy/tree/benchmark-and-profiling. Here we have removed the wss.pl-based memory usage summary and the wrk2+actix-web and strest-grpc test loads, but you can find them in the linked branch. We replaced wrk2 and strest-grpc with fortio because it turned out to give more consistent results (and uses the same software stack for both HTTP and gRPC), simplifies the setup, and includes a UI to inspect and compare the benchmark results.
The plot.py script is still useful for visualizing the difference between the results of two branches, especially in a CI report where the fortio web UI is of less use.
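The kind of comparison a script like plot.py performs can be sketched minimally as follows; the summary format ("name value" lines) and the metric names and values are made-up example data:

```shell
#!/bin/sh
# Hedged sketch: compare per-branch summary files metric by metric.
# Metric names and numbers are illustrative example data only.
cat > master.txt <<'EOF'
http_req_per_s 12000
tcp_gbit_per_s 4.1
EOF
cat > branch.txt <<'EOF'
http_req_per_s 11500
tcp_gbit_per_s 4.0
EOF
# Join on the metric name to print "metric master-value branch-value".
join master.txt branch.txt
```

A plotting script could then render each such pair as a grouped bar per metric for a CI report.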

We would like to get feedback. Is it better to stay with wrk2 and strest-grpc?
The current default req/s values come from my system; you might need to adjust them so that one scenario runs fortio at medium load and one at almost maximum load.
Do you get only small variations between runs? If not, the benchmark runtime may need to be longer (increase the duration or the number of iterations).

The scripts for heap profiling, perf profiling, and benchmarking differ in only a few lines; we could unify them with if branches. We have not renamed them yet, to ease comparison with the branch linked above, but the fortio suffix, for example, can go.

We hope you find the local benchmark data and the perf/heap flamegraphs useful.

@wmorgan wmorgan requested a review from olix0r June 25, 2019 16:47

olix0r commented Jun 25, 2019

@pothos Thanks, Kai. I probably won't have a chance to look at this much before next week, but we really appreciate you sharing this!
