Skip to content

Use Body::size_hint to determine whether a request is retryable#1345

Merged
olix0r merged 10 commits intoeliza/fix-retry-clfrom
ver/can-retry-simplify
Nov 1, 2021
Merged

Use Body::size_hint to determine whether a request is retryable#1345
olix0r merged 10 commits intoeliza/fix-retry-clfrom
ver/can-retry-simplify

Conversation

@olix0r
Copy link
Member

@olix0r olix0r commented Oct 31, 2021

Currently, we look at each request's content-length header to
determine whether a request is too large to buffer for retries. This is
unnecessary, as hyper will already read the content-length and use it
populate the Body::size_hint value.

This change modifies the ReplayBody constructor (now
ReplayBody::try_new) to check the body's size hint and fails to
construct a ReplayBody if the lower bounds of the body size exceeds
the maximum buffer size.

In doing so, we update all Body implementations to ensure that
size_hint is properly implemenented. We also update the
ReplayBody::size_hint implementation to be more robust, returning the
original size hint if a clone of ReplayBody does not currently hold
the inner body.

This change rewrites `RetryPolicy::can_retry` as
`RetryPolicy::body_too_big` and updates it to avoid reading
`content-length` header, instead relying on the `Body::size_hint` value
(which is set by the HTTP server if a `content-length` header is
present.

Renaming the function helps us express what the function actually checks
to help disambiguate it from all of the other 'retry' related
terminology in this file and is more explicit about what it does: we
don't precicesly know if the request is retryable, rather we know
whether the request is obviously too large to be retried. This also lets
us simplify comments and logging as the flow is more self-explanatory.

This change also adds unit tests for `RetryPolicy::body_too_big` to
exercise the variety of scenarios this function handles..

Furthermore, this change ensures that all `Body` implementations
correctly proxy `Body::size_hint`. `ReplayBody` is updated to cache the
the inner body's size hint value in the shared state so it can be reused
across all clones of the `ReplayBody` regardless of whether the current
clone holds the body state or not.
Comment on lines -846 to -847
let has_body_header = req.headers().contains_key("transfer-encoding")
|| req.headers().contains_key("content-length");
Copy link
Member Author

@olix0r olix0r Oct 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea why this test--which only mentions transfer-encoding in its description--also checks for content-length... Now that we're properly preserving the size hint, we can set the content-length header to 0 if we know it's empty. This seems like a strictly good thing.

@olix0r olix0r marked this pull request as draft October 31, 2021 15:52
Currently, we look at each request's `content-length` header to
determine whether a request is too large to buffer for retries. This is
unnecessary, as hyper will already read the `content-length` and use it
populate the `Body::size_hint` value.

This change modifies the `ReplayBody` constructor (now
`ReplayBody::try_new`) to check the body's size hint and fails to
construct a `ReplayBody` if the lower bounds of the body size exceeds
the maximum buffer size.

In doing so, we update all `Body` implementations to ensure that
`size_hint` is properly implemenented. We also update the
`ReplayBody::size_hint` implementation to be more robust, returning the
original size hint if a clone of `ReplayBody` does not currently hold
the inner body.
@olix0r olix0r changed the title Simplify retry policy to use Body::size_hint Use Body::size_hint to determine whether a request is retryable Oct 31, 2021
@olix0r olix0r marked this pull request as ready for review October 31, 2021 16:28
.as_ref()
.map(|rest| {
let hint = rest.size_hint();
(hint.lower(), hint.upper().unwrap_or(0))
Copy link
Member Author

@olix0r olix0r Nov 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that hint.upper().unwrap_or(0) is a subtle bug -- if the lower bound is non-zero, we'll end up having an upper limit that is less than the lower limit, which is bad. It's better to omit the upper limit if the inner body does not specify it.

Copy link
Contributor

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, this seems good to me --- i hadn't realized that size_hint would be set from the request's content-length; i had assumed it was just the size of any currently-received body chunks. this definitely seems like the right approach since it will also let us make better decisions in the case where the whole body has been received but it didn't have a content-length.

i had some minor nits and suggestions, but no hard blockers --- let me know what you think?

}
Either::B(req)
let (head, body) = req.into_parts();
let replay_body = match ReplayBody::try_new(body, MAX_BUFFERED_BYTES) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is really not important at all, but i kind of feel like the prepare_request method should be responsible for checking the size_hint and ReplayBody should just have an unconditional new constructor...that kind of feels like a better separation of concerns to me, with the retry policy code being responsible for determining if the body is too large to retry, and ReplayBody just providing the buffering mechanism.

but, since ReplayBody does still need to be aware of the body size limit so that it can stop buffering if necessary, maybe this is fine...either way, it's not a big deal...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially had it the way you suggest, but I think I liked to make it impossible to create a ReplayBody that is 'illegal' -- this means that the redundant check in clone_request becomes totally unnecessary: by virtue of having constructed a ReplayBody, we know that it doesn't violate our expectations on body size.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough, that's a good point!

Comment on lines +138 to +151
let replay_body = match ReplayBody::try_new(body, MAX_BUFFERED_BYTES) {
Ok(body) => body,
Err(body) => {
tracing::debug!(
size = body.size_hint().lower(),
"Body is too large to buffer"
);
return Either::B(http::Request::from_parts(head, body));
}
};

// The body may still be too large to be buffered if the body's length was not known.
// `ReplayBody` handles this gracefully.
Either::A(http::Request::from_parts(head, replay_body))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, take it or leave it: the early return here feels slightly convoluted, can't this just be

Suggested change
let replay_body = match ReplayBody::try_new(body, MAX_BUFFERED_BYTES) {
Ok(body) => body,
Err(body) => {
tracing::debug!(
size = body.size_hint().lower(),
"Body is too large to buffer"
);
return Either::B(http::Request::from_parts(head, body));
}
};
// The body may still be too large to be buffered if the body's length was not known.
// `ReplayBody` handles this gracefully.
Either::A(http::Request::from_parts(head, replay_body))
match ReplayBody::try_new(body, MAX_BUFFERED_BYTES) {
Ok(replay_body) => {
// The body may still be too large to be buffered if the body's length was not known.
// `ReplayBody` handles this gracefully.
Either::A(http::Request::from_parts(head, replay_body))
>,
Err(body) => {
tracing::debug!(
size = body.size_hint().lower(),
"Body is too large to buffer"
);
Either::B(http::Request::from_parts(head, body))
}
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I had it that way initially. but it feels like the secondary case is exceptional, so it feels appropriate to early return. don't feel strongly either way, though.

Comment on lines +334 to 336
if let Some(rest_upper) = rest_hint.upper() {
hint.set_upper(buffered + rest_upper);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, if rest doesn't have an upper bound, can't we just use the max buffered length as the upper bound, instead of not returning one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we don't know that it's actually the upper limit of the body, do we? My understanding is that ReplayBody should gracefully handle the case when the inner body exceeds the buffer capacity?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yeah, it's only the upper limit if we're a subsequent clone of the body --- if we're the initial body, we will continue reading body chunks if we go past the limit; we'll just stop buffering them. we probably could add a check here and only return the max length if we didn't get an upper bound from rest_hint and we know we're not the original body...but, this is probably fine as-is and it may not be worth bothering with that...

@olix0r olix0r merged commit dae3917 into eliza/fix-retry-cl Nov 1, 2021
@olix0r olix0r deleted the ver/can-retry-simplify branch November 1, 2021 17:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants