Skip to content
This repository was archived by the owner on Nov 17, 2018. It is now read-only.

Conversation

@rynowak
Copy link
Member

@rynowak rynowak commented Feb 14, 2018

The makings of some integration for using HttpClientFactory and Polly together.

Right now my focus is on nailing down the basics of the public-facing API:

  • What are the most common scenarios?
  • How does a beginner figure out what to do?
  • Does an expert have enough control?

My goals is to answer those questions (with feedback) along with the the following non-goals:

  • Don't lock users out of powerful but complicated things
  • Don't try to address every possible scenario and combination
  • Don't hide or abstract away Polly's APIs

@rynowak
Copy link
Member Author

rynowak commented Feb 14, 2018

Pinging everyone who's commented on the issue thread, I welcome your thoughts and feedback. @stevejgordon @theezak @reisenberger and @joelhulen as well as @glennc @javiercn and @anurse for good measure


namespace Microsoft.Extensions.Http
{
public class PolicyHttpMessageHandler : DelegatingHandler
Copy link
Member Author

@rynowak rynowak Feb 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key building block here is a message handler that plugs in a policy to the pipeline. Ideally this is as generic as possible.

Not seen here: we'll want logging as well as that's one of the best ways that we add value by providing this. I'd like to defer discussions about that until after the initial API for configuring clients is designed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to this highly general PolicyHttpMessageHandler

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reisenberger - do you think that we'll end up having a generic and non-generic policy handler? That seems logical to me based on the desire to provide both:

public static IHttpClientBuilder AddPolicyHandler(this IHttpClientBuilder builder, IAsyncPolicy policy)
public static IHttpClientBuilder AddPolicyHandler(this IHttpClientBuilder builder, IAsyncPolicy<HttpResponseMessage> policy)

Do you agree with this direction?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

While only pure IAsyncPolicy<HttpResponseMessage> policies would be needed to govern SendAsync(...), common Polly syntax (for example Policy.TimeoutAsync(10)) results more naturally in a non-generic IAsyncPolicy, so not supporting IAsyncPolicy might frustrate users.

Generic policies are marginally more performant but any difference is in the order of microseconds or less.

.AddServerErrorPolicyHandler(p => p.RetryAsync())

// Build a policy that will handle exceptions, 400s, and 500s from the remote server
.AddBadRequestPolicyHandler(p => p.RetryAsync())
Copy link
Member Author

@rynowak rynowak Feb 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the best place to start the discussion.

I understand the requirements for building a policy to be:

  • what kinds of things do I consider an error?
  • what do I want to do when I find one?

I think there's some value in us providing guidance about the first question (what should I consider an error), and I think the second question is easy enough for users to answer (p => p.RetryAsync()).

I've defined a few flavors of extension methods here. I think AddServerErrorPolicyHandler is probably going to be the most useful balance between "sane default" and hiding complexity. Most folks will always want to treat a connection failure or a 500 as an error.

If you use AddPolicyHandler to build AddServerErrorPolicyHandler it will end up looking like:

.AddPolicyHandler(
  Policy
    .Handle<HttpRequestException>()
    .OrResult<HttpResponseMessage>(r => r.StatusCode >= HttpStatusCode.InternalServerError)
    .RetryAsync());

I think it's a good enough for us to find an attractive name for this and include it 👍

A few questions for discussion:

  • Handlers may be long-lived or short-lived. Is it important we reuse policies? It is important that we don't?
  • There's a little bit of tension around whether or not things should be IAsyncPolicy or IAsyncPolicy<T> in some of the simple cases, so I planned to support both. Is this a good plan?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I jump in even if not pinged. Policies' lifecycle should be independent from the handler's one because there are policies like the short-circuit one that are designed to be "singleton-ish".

Ideally policies should be attached to freshly generated handler's but it feels like it leads to a pool manager for the policies to be used by a pool manager for the handlers...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kralizek thanks!

The factory already provides some pooling and lifecycle management for handlers (docs in progress).

  • It's easy to make a policy a singleton.
  • It's also easy to make a policy have the same lifetime as a handler.
  • It would be a bit tricky to do something different, and I'd like to avoid it if possible

Policies' lifecycle should be independent from the handler's one because there are policies like the short-circuit one that are designed to be "singleton-ish".

This makes intuitive sense to me.

Are there real use cases for making policies that have the same lifetime as the handler? Should they all be singleton?

Copy link
Contributor

@reisenberger reisenberger Feb 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Responding to @rynowak and @Kralizek on policy lifecycle:

TL;DR (I have assumed...) Policies should be single-instance for each policy configured for use with a given named HttpClient configuration.

For example, if a Polly policy for a given named HttpClient is config'd by supplying a createPolicy() factory Func, that createPolicy() func should be used to create one instance of that policy, and that single policy instance supplied to all subsequently created handlers/HttpClient instances of that named configuration.


Explanation

Two policy types maintain state across calls for their designed function: circuit-breakers (tracking call faults/successes) and bulkheads (limiting parallelism).

If HttpClientFactory were to create fresh instances when recycling handlers, that state would suffer either discontinuity (if new instances replaced old instances 'instantaneously') or fragmentation/multiplicity (if new and old instances co-existed).

eg Bulkhead: Generating fresh BulkheadPolicy instances for new handlers could cause executions to exceed the parallelism configured by the developer, as double/multiple capacity would be available while both newly-spun-up and existing / being-retired instances co-existed.

No other policies maintain state, but all Polly policies are entirely thread-safe and memory-efficient for long lifetime, so there's no real case I can see for shorter lifetimes; it'd just be more allocations and GCs.


Does this sound right to you?


<PropertyGroup>
<Description>
The HttpClient factory is a pattern for configuring and retrieving named HttpClients in a composable way. This package provides integration for IHttpClientFactory using the Polly library for exception handling and fault-tolerance.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reisenberger @joelhulen - feel free to provide feedback about how we're describing your project 😆

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a bad start :)

I'd perhaps word the second sentence as such:

This package integrates IHttpClientFactory with the Polly library, to add transient-fault-handling and resiliency through fluent policies such as Retry, Circuit Breaker, Timeout, Bulkhead Isolation, and Fallback.

@reisenberger, do you agree?

throw new ArgumentNullException(nameof(request));
}

return _policy.ExecuteAsync(() => base.SendAsync(request, cancellationToken));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Polly supports handing the cancellation token to the policy itself as well so it can be used by its internals (at least that's my understanding), which is passed through into the lambda doing the actual work. Should the cancellation token be used that way here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, eg:

return _policy.ExecuteAsync(token => base.SendAsync(request, token), cancellationToken);

An example benefit: if the user has configured a wait-and-retry policy, a cancellationToken passed through the policy in this manner also governs the wait phase of wait-and-retry, so that signalling the cancellationToken will cause the policy to abandon the wait immediately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this is just something I missed. Will fix in the next iteration.

Copy link

@analogrelay analogrelay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me. Happy to let the Polly folks take the lead on what API makes sense here.

//builder.AddHttpMessageHandler(() => new PolicyHttpMessageHandler(createPolicy()));
return builder;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Polly policies divide into reactive policies (Retry; CircuitBreaker; Fallback), and proactive and preemptive policies (Cache; Bulkhead; Timeout). The reactive policies are the ones which flow from PolicyBuilder, so the choice of configuration syntax here with variants around Func<PolicyBuilder..., IAsyncPolicy...> focuses on (only) the Retry, CircuitBreaker and Fallback policies.

I'm drawing this out in case it wasn't intended, but it seems reasonable, as Retry and CircuitBreaker seem likely majority use cases for Polly with HttpClientFactory. And for those who want to go further, the more advanced overloads (eg AddPolicyHandler(this IHttpClientBuilder builder, IAsyncPolicy policy) line 14) do provide.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this was deliberate, but I agree with your judgement that Retry and CircuitBreaker are likely to be the most commonly used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. This will be good info for writing the docs.

return builder;
}

public static IHttpClientBuilder AddExceptionPolicyHandler(this IHttpClientBuilder builder, Func<PolicyBuilder, IAsyncPolicy<HttpResponseMessage>> createPolicy)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point of detail: this overload is not needed. Polly fluent syntax happens to dictate that PolicyBuilder must already be generic PolicyBuilder<TResult> if the eventual policy is to be generic in TResult. (no configuration overloads flow from PolicyBuilder to IAsyncPolicy<TResult>.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! will remove this in the next iteration.

//builder.AddHttpMessageHandler(() => new PolicyHttpMessageHandler(createPolicy()));
return builder;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there as clear motivation for an AddBadRequestPolicyHandler() covering the 4xx codes generally, as for the other overloads? Where Http status codes in the 4xx range indicate client errors, there may be no point in retrying these (eg if a request is badly formed 400, resubmitting it is usually not going to make it succeed). Likewise, one typically might not want to take client-errors as cause to break a circuit-breaker.

Some specific exceptions:

  • 408 (afaiu) the server signalling giving up waiting for the client's request to arrive complete; it could reflect wider network connectivity which merits a retry.
  • 429 RetryAfter: Polly has retry overloads which can handle this case (extract how long to wait before the next try, from the 429 response), but it remains a fairly specific configuration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there as clear motivation for an AddBadRequestPolicyHandler() covering the 4xx codes generally, as for the other overloads?

Not really. That was just a straw man for us to beat up.

I still want to get more feedback on whether we should have both AddExceptionPolicyHandler and AddServerErrorPolicyHandler.. When might you want to apply a policy to a connection failure but not a 500? Still needs more thought IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[should we] have both AddExceptionPolicyHandler and AddServerErrorPolicyHandler?
When might you want to apply a policy to a connection failure but not a 500?

If the questions asked by users to the Polly team are a relevant measure, then handling them differently seems at least uncommon. We received a lot of requests for policies to handle result codes as well as exceptions, but since adding that, no requests for help how to handle them differently. My feeling (fwiw) is the majority use case defines a common strategy for all kinds of transient fault at an endpoint.

If AddExceptionPolicyHandler and AddServerErrorPolicyHandler were merged, the power overloads you are proposing would still provide, for those who want more nuanced strategies.

Still interested what others may think.


The two enquiries the Polly team handles periodically (I would not say frequently) on behaviour for specific result codes are:

We also in the Polly doco recommend retrying on 408s, similar to advice from the Azure team.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth considering if we want to follow up on a convenience method for 'retryable' faults. Adding it to my todo list.

Regarding AuthZ, we plan to do that in the future (not this release). I'm not sure that plan is related to Polly, but we're interested in your thoughts as well.

Copy link
Contributor

@reisenberger reisenberger Feb 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth considering if we want to follow up on a convenience method for 'retryable' faults.

Happy to hear more what is the thinking here. Is this around helpers for commoner cases of how one might handle (common retry strategies such as exponential backoff), or something else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to hear more what is the thinking here. Is this around helpers for commoner cases of how one might handle (common retry strategies such as exponential backoff), or something else?

I think we'd want to provide a more opinionated approach to what we consider a 'retry-able' request. The starting point would be the info that you've already shared with us from your docs/guidance.

I think the polly API for creating different retry policies is already sufficient.

If you have more ideas about this, feel free to open an issue for more focused discussion. Your feedback has been really valuable.

@stevejgordon
Copy link
Contributor

I like where this is headed 👍

@aristovg
Copy link

aristovg commented Feb 15, 2018

Sorry for jumping in, but have you considered the HttpClient timeout when doing retries within delegating handler? Consider the following scenario:

  • HttpClient timeout is set to 10 seconds
  • Retry policy has 5 attempts
  • Each attempts takes 5 seconds (either it is set by retry policy or server is responding at that rate).
    This will result in only 2 retries being made instead of 5.

Any thoughts on that?

@stevejgordon
Copy link
Contributor

I was playing with this briefly last night and thinking about the scenario we'd likely have were we like to register and reuse policies. This is what I ended up with...

NOTE: It's a bit rushed and I didn't have time to / couldn't actually get the sample running so this may very well not work in practice!

Add an extension:

public static IHttpClientBuilder AddPolicyHandler<T>(this IHttpClientBuilder builder, string namedPolicy) where T : IPolicyRegistry<string>
{
	IReadOnlyPolicyRegistry<string> registry;

	using (var sp = builder.Services.BuildServiceProvider())
	{
		registry = sp.GetService<T>();
	}
	
	if (registry != null)
	{
		if (registry.TryGet<IAsyncPolicy<HttpResponseMessage>>(namedPolicy, out var policy))
		{
			builder.AddHttpMessageHandler(() => new PolicyHttpMessageHandler(policy));
		}
	}

	return builder;
}

Basic consumption...

public static void Configure(IServiceCollection services)
{
	services.AddSingleton<PolicyRegistry>(s =>
	{
		var registry = new PolicyRegistry();

		var policy = Policy.Handle<HttpRequestException>().RetryAsync();

		registry.Add("mypolicy", policy);

		return registry;
	});

	services.AddHttpClient("github", c =>
	{
		c.BaseAddress = new Uri("https://api.github.com/");

		c.DefaultRequestHeaders.Add("Accept", "application/vnd.github.v3+json"); // Github API versioning
		c.DefaultRequestHeaders.Add("User-Agent",
			"HttpClientFactory-Sample"); // Github requires a user-agent
	})

	.AddPolicyHandler<PolicyRegistry>("mypolicy");
}

I'm not sure if this is a) a good approach b) useful as part of the main library but interested in thoughts on this style of usage!

@rynowak
Copy link
Member Author

rynowak commented Feb 16, 2018

Thanks for all of the comments, I'm super excited to see so much interest. I'm going to go through and respond to everyone's feedback, and I'll create a more complete version of this PR over the weekend.

@rynowak
Copy link
Member Author

rynowak commented Feb 16, 2018

Responding to a comment from @aristovg

Sorry for jumping in, but have you considered the HttpClient timeout when doing retries within delegating handler? Consider the following scenario:

HttpClient timeout is set to 10 seconds
Retry policy has 5 attempts
Each attempts takes 5 seconds (either it is set by retry policy or server is responding at that rate).
This will result in only 2 retries being made instead of 5.
Any thoughts on that?

For reference the default timeout is 100 seconds.

I guess I'd say that with a scheme like this you can configure the overall timeout of the system (including retries) using the timeout on HttpClient. You can configure the timeout on the actual HTTP requests by using another handler (after the policy) with a stricter timeout enforced by a cancellation token.

Polly has policies for timeout as well, so I'd assume that once I address this feedback that would 'just work'.

It's a good point and I'm glad that you brought it up. We will need to document this.

@rynowak rynowak mentioned this pull request Feb 16, 2018
6 tasks
@rynowak
Copy link
Member Author

rynowak commented Feb 16, 2018

Responding to a comment from @stevejgordon

I was playing with this briefly last night and thinking about the scenario we'd likely have were we like to register and reuse policies. This is what I ended up with...

Yeah, I like the direction of what you came up with. This seems promising. I'm tracking this for follow up here: #8

@rynowak
Copy link
Member Author

rynowak commented Feb 16, 2018

/cc @sebastienros

@reisenberger
Copy link
Contributor

reisenberger commented Feb 16, 2018

Responding to @aristovg and @rynowak re timeouts:

Sorry for jumping in, but have you considered the HttpClient timeout when doing retries within
delegating handler? Consider the following scenario:

HttpClient timeout is set to 10 seconds
Retry policy has 5 attempts
Each attempts takes 5 seconds (either it is set by retry policy or server is responding at that rate).
This will result in only 2 retries being made instead of 5.
Any thoughts on that?

[...] You can configure the timeout on the actual HTTP requests by using another handler (after the
policy) with a stricter timeout enforced by a cancellation token.
Polly has policies for timeout as well [...] so I'd assume [...] that would 'just work'.

Exactly so. Polly's TimeoutPolicy functions by this same mechanism, so one could:

.AddServerErrorPolicyHandler(p => p.RetryAsync(/* detailed  configuration */))
.AddPolicyHandler(Policy.TimeoutAsync(10)) // impose 10-second timeout on each individual retry

Additionally, rather than only letting the TaskCanceledException or OperationCanceledException propagate, Polly's TimeoutPolicy throws a wrapping Polly.Timeout.TimeoutRejectedException, if specifically distinguishing timeout to be the cause. This helps any wrapping RetryPolicy distinguish timeout cancellation from other cancellation: for instance, one commonly wants to retry on timeout but bail out immediately (not retry) on user/caller cancellation.


Great to have drawn this point out and agree that the role of each timeout merits documentation.

@rynowak
Copy link
Member Author

rynowak commented Feb 21, 2018

Updated this. I think I've addressed all of the important feedback that isn't tracked somewhere else.

I want to come to an understanding still on https://github.com/aspnet/HttpClientFactory/pull/58/files#r169533010

}

// Important - cache policy instances so that they are singletons per handler.
var innerPolicy = policy.WrapAsync(Policy.NoOpAsync<HttpResponseMessage>());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reisenberger - what do you think of this approach instead of creating two separate code paths?

Having two separate code paths just seems a bit like clutter - and then there's naming to think about 😲..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var innerPolicy = policy.WrapAsync(Policy.NoOpAsync<HttpResponseMessage>());

I like how this simplifies the separate codepaths; that concerned me too.

@rynowak
Copy link
Member Author

rynowak commented Feb 21, 2018

It sounds like we've pretty much converged at this point, and there are lots of follow up items 👍

I'm going to go ahead and merge this once I do all the other fancy build stuff I need to do in our private repos.

Adds an Microsoft.Extensions.Http.Polly project for convenience
functionality to integrate Polly with the IHttpClientFactory.

This is just basic building blocks now, and more is coming.
@rynowak rynowak merged commit 157cb89 into dev Feb 22, 2018
@rynowak rynowak deleted the rynowak/polly branch February 22, 2018 01:12
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants