Skip to content

New Tests to guard against regression when routing and weighted target#24055

Merged
donnadionne merged 1 commit intogrpc:masterfrom
donnadionne:new_tests
Sep 8, 2020
Merged

New Tests to guard against regression when routing and weighted target#24055
donnadionne merged 1 commit intogrpc:masterfrom
donnadionne:new_tests

Conversation

@donnadionne
Copy link
Copy Markdown
Contributor

@donnadionne donnadionne commented Sep 3, 2020

logic are moved from policies to XdsConfigSelector:

  1. Cluster update test: we never had a test that explicitly updated
    clusters multiple times.
  2. A cluster update test that includes long running RPCs, in particular,
    RPCs not yet committed.

@donnadionne


This change is Reviewable

@donnadionne donnadionne added the release notes: no Indicates if PR should not be in release notes label Sep 3, 2020
Copy link
Copy Markdown
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is good coverage. I think the tests can be simplified a bit.

Please let me know if you have any questions. Thanks!

Reviewed 1 of 1 files at r1.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @donnadionne)


test/cpp/end2end/xds_end2end_test.cc, line 1534 at r1 (raw file):

  std::tuple<int, int, int> WaitForAllBackends(
      size_t start_index = 0, size_t stop_index = 0, bool reset_counters = true,
      bool allow_failures = false,

I suggest putting this parameter at the end of the list. That way, you don't need to modify any of the existing call sites, since they'll continue to use the default value.


test/cpp/end2end/xds_end2end_test.cc, line 3644 at r1 (raw file):

  RouteConfiguration new_route_config =
      balancers_[0]->ads_service()->default_route_config();
  auto* route1 = new_route_config.mutable_virtual_hosts(0)->mutable_routes(0);

Why do we need two different routes here? It seems like the test would be simpler and would still cover the case we're interested in if we have just one route and send an update that changes the cluster that that route points to.


test/cpp/end2end/xds_end2end_test.cc, line 3734 at r1 (raw file):

  RouteConfiguration new_route_config =
      balancers_[0]->ads_service()->default_route_config();
  auto* route1 = new_route_config.mutable_virtual_hosts(0)->mutable_routes(0);

I think this test only needs one route as well. The steps should be:

  1. Start with backend 0 down.
  2. Send a RouteConfiguration with a single default route that points to cluster A, which contains backend 0.
  3. Start the sending thread, which will send exactly one RPC with no deadline and with wait_for_ready=true. This RPC will not complete until after backend 0 is started (which we'll do in step 7 below).
  4. In the main thread, send a non-wait_for_ready RPC. This RPC should fail, which will tell us that the client has received the update and attempted to connect.
  5. Send a RouteConfiguration update that changes the default route to point to cluster B, which contains backend 1.
  6. Call WaitForBackend(1, /*reset_counters=*/false). This ensures that the client has processed the update.
  7. Now start backend 0. This will allow the sending thread to complete its RPC.
  8. Join the sending thread.
  9. Verify that backends 0 and 1 each received 1 RPC.

It is conceivably possible that this test could be a bit flaky, because the sending thread might not actually have done the LB pick by the time the client sees the second RouteConfiguration update. However, given that we're sending another RPC in between, the odds are good. I suggest running this 1000 times in RBE to make sure it doesn't flake too much.


test/cpp/end2end/xds_end2end_test.cc, line 3755 at r1 (raw file):

  ResetBackendCounters();
  // Start sending RPCs before changing route configuration.
  std::atomic_bool stop_rpc{false};

This should not be necessary, nor should we need a loop in the sending thread. The sending thread should just send a single RPC and should return as soon as the RPC is done.


test/cpp/end2end/xds_end2end_test.cc, line 3761 at r1 (raw file):

          SendRpc(RpcOptions()
                      .set_rpc_service(SERVICE_ECHO1)
                      .set_wait_for_ready(true));

In addition to setting wait_for_ready, you also need to send the RPC with no deadline. I suggest changing SendRpc() to not set a deadline if rpc_options.timeout_ms is 0.


test/cpp/end2end/xds_end2end_test.cc, line 3767 at r1 (raw file):

  // Bring down the current backend, this will delay route picking time,
  // resulting in un-committed RPCs.
  ShutdownBackend(1);

This needs to be done at the very top of the test, before the channel starts trying to connect.

Copy link
Copy Markdown
Contributor Author

@donnadionne donnadionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @markdroth)


test/cpp/end2end/xds_end2end_test.cc, line 1534 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

I suggest putting this parameter at the end of the list. That way, you don't need to modify any of the existing call sites, since they'll continue to use the default value.

Done.


test/cpp/end2end/xds_end2end_test.cc, line 3644 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

Why do we need two different routes here? It seems like the test would be simpler and would still cover the case we're interested in if we have just one route and send an update that changes the cluster that that route points to.

Done.


test/cpp/end2end/xds_end2end_test.cc, line 3734 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

I think this test only needs one route as well. The steps should be:

  1. Start with backend 0 down.
  2. Send a RouteConfiguration with a single default route that points to cluster A, which contains backend 0.
  3. Start the sending thread, which will send exactly one RPC with no deadline and with wait_for_ready=true. This RPC will not complete until after backend 0 is started (which we'll do in step 7 below).
  4. In the main thread, send a non-wait_for_ready RPC. This RPC should fail, which will tell us that the client has received the update and attempted to connect.
  5. Send a RouteConfiguration update that changes the default route to point to cluster B, which contains backend 1.
  6. Call WaitForBackend(1, /*reset_counters=*/false). This ensures that the client has processed the update.
  7. Now start backend 0. This will allow the sending thread to complete its RPC.
  8. Join the sending thread.
  9. Verify that backends 0 and 1 each received 1 RPC.

It is conceivably possible that this test could be a bit flaky, because the sending thread might not actually have done the LB pick by the time the client sees the second RouteConfiguration update. However, given that we're sending another RPC in between, the odds are good. I suggest running this 1000 times in RBE to make sure it doesn't flake too much.

Done.
When test is submitted, the routing has not moved to XdsConfigSelelctor yet, so both rpcs in step 9 will go to backend 1, because backend 0 is not ready.

This test will be modified of course with the move PR.


test/cpp/end2end/xds_end2end_test.cc, line 3755 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

This should not be necessary, nor should we need a loop in the sending thread. The sending thread should just send a single RPC and should return as soon as the RPC is done.

Done.


test/cpp/end2end/xds_end2end_test.cc, line 3761 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

In addition to setting wait_for_ready, you also need to send the RPC with no deadline. I suggest changing SendRpc() to not set a deadline if rpc_options.timeout_ms is 0.

Done.


test/cpp/end2end/xds_end2end_test.cc, line 3767 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

This needs to be done at the very top of the test, before the channel starts trying to connect.

Done.

Copy link
Copy Markdown
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Please resolve remaining comment and squash commits before merging. Thanks!

Reviewed 1 of 1 files at r2.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @donnadionne)


test/cpp/end2end/xds_end2end_test.cc, line 3734 at r1 (raw file):

Previously, donnadionne wrote…

Done.
When test is submitted, the routing has not moved to XdsConfigSelelctor yet, so both rpcs in step 9 will go to backend 1, because backend 0 is not ready.

This test will be modified of course with the move PR.

Good point. This looks good.


test/cpp/end2end/xds_end2end_test.cc, line 3687 at r2 (raw file):

  sending_rpc.join();
  // Make sure RPCs go to the correct backend:
  // Before moving routing to XdsConfigSelector, 2 to backend 1;

Please make this comment a TODO.

logic are moved from policies to XdsConfigSelector:

1. Cluster update test: we never had a test that explicitly updated
clusters multiple times.
2. A cluster update test that includes long running RPCs, in particular,
RPCs not yet committed.
@donnadionne donnadionne merged commit 0c1ca7b into grpc:master Sep 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release notes: no Indicates if PR should not be in release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants