Test setup on my laptop:
Running fling like so:
bins/opt/fling_server --bind=localhost:10000
bins/opt/fling_client --target=localhost:10000
Versus running qps like so:
bins/opt/qps_server -port 10000
bins/opt/qps_client --server_port 10000 -client_channels=1 -client_threads=1 -num_rpcs=100000
I measure a latency of 15-20us for fling and around 180us for qps.
I believe the two setups are comparable: each runs a single client against a server on the same machine.
The difference needs to be thoroughly explained (and eliminated) -- protobuf serialization and API wrapping cannot account for 160us.
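For reference, here is the arithmetic behind the 160us figure as a small Python sketch. It uses only the latencies quoted above. The constant and function names are made up for illustration and are not part of fling or qps.

```python
# Latencies reported above, in microseconds.
FLING_US = (15, 20)  # observed range for fling
QPS_US = 180         # approximate figure for qps


def unexplained_gap(fling_range, qps):
    """Return (min_gap, max_gap) in microseconds between qps and fling."""
    lo, hi = fling_range
    return qps - hi, qps - lo


def slowdown(fling_range, qps):
    """Return (min_ratio, max_ratio) of qps latency over fling latency."""
    lo, hi = fling_range
    return qps / hi, qps / lo


gap = unexplained_gap(FLING_US, QPS_US)
ratio = slowdown(FLING_US, QPS_US)
print(f"gap: {gap[0]}-{gap[1]}us, slowdown: {ratio[0]:.0f}x-{ratio[1]:.0f}x")
# prints "gap: 160-165us, slowdown: 9x-12x"
```

So qps is roughly an order of magnitude slower per round trip, and the 160-165us gap is what needs to be accounted for.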