PHP benchmark: report histogram and unconstrained scenarios#12760
ZhouyihaiDing merged 1 commit into grpc:master
Conversation
src/ruby/qps/proxy-worker.rb
Hi there,
It shouldn't be assumed that there are exactly 3 marks. It is possible that we would want more statistics-reporting points than that, and all the benchmarks are written to assume that there can be an unlimited number of marks. That's why the mark has a reset field: to allow some marks to specify that the stats should be cleared. It's fine to react to marks differently from other operations, but I don't think you should assume a count of marks.
Thanks for the correction. Then I also can't assume that the last mark is always the one used for the benchmark.
If a middle "mark" is used for the benchmark, uploading the data still consumes time from the next mark, which lowers the QPS. I didn't see where I can pause the driver to let Ruby finish uploading before the next "mark" starts.
I guess I can keep using the signal approach to let PHP know when "mark" is called, but instead of uploading all the latencies, I should implement a histogram on the PHP side and upload it once before each mark.
Do you have any suggestions for letting PHP know when to upload, other than using a signal? The response from the server seems useless because it stays the same until the stream is closed.
Is it right that "reqs.each do |req|" works like this?

while true
  req = stream.recv   # the program will hang here for the warm-up time and much of the benchmark time
  case
  when req.setup then handle setup
  when req.mark then get data from PHP, upload the histogram to the driver
  end
end

So inside the mark handling, the streaming upload will always be expensive, and it takes time away from the next "mark". What I'm doing now is sending the split latencies together, which may speed things up through cache or buffer re-use or something. But the best way is just to upload once, right?
I think the right thing is to restructure the proxy worker. Right now the PHP client talks to the proxy worker over a 1-sided client-side streaming RPC called ReportTime. Instead, it should be changed to a bidi-streaming RPC. On the bidi streaming RPC, the proxy worker should send each mark to the PHP client. The PHP client should read that and then respond with a histogram that it has been collecting over the execution. The proxy worker can then pass that histogram back to the driver. That will require maintaining a histogram in the PHP client and sending a ClientStats proto back to the proxy worker on a mark. Does that make sense? Maybe the Quit and other messages could also be passed from the proxy worker to the PHP client like that to take signal out of the mix.
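A rough sketch of that restructure, with the bidi RPC simulated by two in-process queues so the control flow is runnable; the message and field names here are illustrative stand-ins, not the benchmark protos' actual definitions:

```ruby
# Proxy worker forwards each Mark to the PHP client over a bidi stream and
# waits for a ClientStats-style reply. Queues stand in for the two stream
# directions of the real bidi-streaming RPC.
marks = Queue.new   # proxy worker -> PHP client (Mark messages)
stats = Queue.new   # PHP client -> proxy worker (ClientStats messages)

client = Thread.new do
  collected = 0     # stands in for the client's running histogram
  loop do
    mark = marks.pop
    break if mark == :quit          # Quit could travel on the same stream
    stats.push(count: collected)    # snapshot the histogram and send it back
    collected = 0 if mark[:reset]   # a mark with reset clears the stats
  end
end

# Proxy-worker side: on a Mark from the driver, forward it and block on stats.
marks.push(reset: true)
reply = stats.pop                   # pass this histogram back to the driver
marks.push(:quit)
client.join
```

The key property is that each mark costs one round trip on an already-open stream, rather than a burst of stats-reporting RPCs.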
Using a bidi stream sounds cool. I'll try an async approach, or fork a child process that listens on this bidi-streaming RPC synchronously. Thanks for your help!
I'll close this PR and reopen it once I finish.
Oh yes, I forgot about that point, that neither of those options is really natural in PHP. I think that's why I went with the 1-directional streaming thing in the first place. Probably needs better investigation then.
Another possibility: go with the example that you're currently using (storing the latency results in an array and sending it out as a batch) but send it out periodically like once every 100 ms or something so that you get the batching effects that you're talking about but without having to alter the framework altogether. The important thing is just to send far fewer stats-reporting RPCs.
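The periodic-batching idea could look roughly like this, sketched in Ruby for brevity even though the real client is PHP; the class and its names are hypothetical:

```ruby
# Accumulate latencies and flush at most once per interval, so far fewer
# stats-reporting RPCs are sent than one per sample.
class BatchedReporter
  def initialize(interval_s, &flush)
    @interval_s = interval_s
    @flush = flush          # e.g. a proc that writes a batch to the stats RPC
    @pending = []
    @last_flush = Time.now
  end

  def record(latency)
    @pending << latency
    return if Time.now - @last_flush < @interval_s
    @flush.call(@pending)   # one batched send instead of one RPC per sample
    @pending = []
    @last_flush = Time.now
  end
end
```

With an interval of roughly 100 ms this keeps the batching effect while leaving the framework unchanged.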
I'm still using a signal to let them talk to each other. I tried creating a child process with a streaming RPC listening there, but it seems PHP gRPC doesn't work in a child process (the same code works in the parent but hangs in the child). I don't know yet whether it's a PHP problem or a PHP-gRPC problem.
Currently, every mark triggers a PHP upload, which continues until the data is uploaded.
I'll implement the histogram on the PHP side next.
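For reference, an exponential-bucket histogram of the kind the benchmark framework uses (cf. src/ruby/qps/histogram.rb) can be sketched like this; it is a simplified illustration, not the framework's exact code:

```ruby
# Bucket boundaries grow geometrically, so a wide latency range needs only a
# few thousand buckets while relative error stays bounded by the resolution.
class Histogram
  def initialize(resolution, max_possible)
    @multiplier = 1.0 + resolution
    @buckets = Array.new(bucket_for(max_possible) + 1, 0)
    @count = 0
    @sum = 0.0
  end

  attr_reader :buckets, :count, :sum

  def add(value)
    @buckets[bucket_for(value)] += 1
    @count += 1
    @sum += value
  end

  private

  def bucket_for(value)
    # log base @multiplier of the value; values below 1 land in bucket 0
    (Math.log([value, 1.0].max) / Math.log(@multiplier)).to_i
  end
end
```

Reporting then means sending the bucket counts once per mark instead of one upload per latency sample.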
I tried to create a child process in PHP to handle the stats stream, but it seems gRPC has some problem inside a child process. So I can only handle the signal inside the main process.
Results show that batch-sending once every 1000 latencies is faster than using a signal as the trigger.
I changed it on the Ruby side:
src/php/tests/qps/client.php
space after if for consistency
src/php/tests/qps/histogram.php
add a space around the = for consistency
Oh, I was using this to print the benchmark scenarios from other languages to see how many channels and RPCs per channel they use.
I'll delete it.
nit: I'd just enumerate "php7" and "php7_protobuf_c" here, no need to use wildcards.
Also no reason to pass $CONFIG as an argument - that variable will be available in build_performance_php.sh anyway - taking an env variable, passing it as a param only to make it an env variable seems unnecessary.
Done. I used wildcards in case we need to add benchmarks for both php and php7, but that requires switching the language version in the VM, so I'm focusing only on building the benchmark for php7 for now.
Also, looks like you are now building php twice - once for php7 and once for php7_protobuf_c, given the build process seems to be the same regardless of using protobuf_c, maybe you can just build everything as language "php"?
Basically, that would mean differentiating the language name in scenario_config.py ( = str(language)) and "language to build" by introducing language.lang_to_build().
Building PHP twice is not a problem if:
1.) the second build takes negligible time (everything already built and installed)
2.) you can guarantee that the PHP builds won't run concurrently and therefore collide with each other.
So the builds are not running concurrently: they are executed one by one by build_performance.sh.
One possible trick (and possibly the simplest):
if both php7 and php7_protobuf_c are passed to build_performance.sh,
it can remember that PHP has already been built when the first build is triggered
(PHP_ALREADY_BUILT=the_lang_that_built_it), and the second build request can be skipped ("Skipping PHP build as already built by $PHP_ALREADY_BUILT").
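A minimal sketch of that guard; the function name and the echo stand-in for the real PHP build step are illustrative, not build_performance.sh's actual code:

```shell
# Build each requested language once; skip repeat PHP builds.
build_languages() {
  PHP_ALREADY_BUILT=""
  for language in "$@"; do
    case "$language" in
      php7|php7_protobuf_c)
        if [ -n "$PHP_ALREADY_BUILT" ]; then
          echo "Skipping PHP build as already built by $PHP_ALREADY_BUILT"
        else
          echo "building PHP for $language"  # real script invokes the PHP build here
          PHP_ALREADY_BUILT="$language"
        fi
        ;;
      *)
        echo "building $language"            # other languages build as before
        ;;
    esac
  done
}
```

Since the languages are built sequentially, a plain shell variable is enough; no locking is needed.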
Done, thank you for the suggestions. I check PHP_ALREADY_BUILT and build PHP only once.
jtattermusch left a comment
LGTM for the changes to performance testing scripts (.py and .sh). Thanks for addressing my comments, I think now things look much better!
After merging, please monitor the benchmark suite for a while to check that PHP builds are not introducing extra flakiness.
LGTM. I'm fine with merging, but I would like an opinion from @jtattermusch or @adelez: does it make sense to put the PHP QPS test on the dashboard? The model of parallelism used here (a separate process per client channel) is completely different from the other wrapped languages (which use one process and may have multiple threads). I know this is because multithreading isn't idiomatic in PHP, but it might muddle the actual contents of the multilingual dashboard.
php7 build once
Not related:
@vjpai @adelez @jtattermusch
Now it sends histogram stats to the Ruby proxy once every 2000 ping-pongs.
QPS tests on my own local machine (protobuf C extension, 20 s warm-up, 60 s benchmark):