Conversation

@tclinken
Contributor

@tclinken tclinken commented Apr 22, 2019

This is a continuation of PR #1229. It fixes some bugs from that pull request.

@kaomakino
Contributor

[Screenshot: Screen Shot 2019-05-07 at 5.14.01 PM]
These charts show how the local ratekeeper protects us from a heavily skewed read/write workload.
I used a highly skewed workload that generates heavy range reads and writes targeting a single team. Without the local ratekeeper, when a storage server gets too many read requests, it cannot process its pending write tasks, so the NDV won't decrease. With the local ratekeeper, when the NDV rises above a certain threshold, the server starts throttling reads (shown in the two leftmost charts). While reads are throttled, the storage server can process its write tasks, so the NDV comes back down.

In comparison, here are the charts from current master with the same workload.
[Screenshot: Screen Shot 2019-05-07 at 5.56.47 PM]
Note that the NDV does not come down because of the heavy incoming reads.
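
For readers skimming this thread, here is a minimal sketch of the throttling idea being evaluated above. All names and threshold values are illustrative assumptions, not the identifiers or knobs used in the actual diff: the server derives a probability of serving reads from its non-durable version and rejects a fraction of incoming reads when that probability drops below 1.

// Illustrative sketch only: the real implementation lives in this PR's
// storage server changes and takes its thresholds from server knobs.
#include <algorithm>
#include <cstdint>
#include <random>

struct LocalThrottleSketch {
    // Hypothetical thresholds on the non-durable version (NDV).
    static constexpr int64_t startThrottleNDV = 5'000'000;
    static constexpr int64_t maxNDV = 10'000'000;

    std::mt19937 rng{std::random_device{}()};

    // Probability of serving a read: 1.0 while NDV is below the start
    // threshold, falling linearly toward 0.0 as NDV approaches maxNDV.
    double readProbability(int64_t currentVersion, int64_t durableVersion) const {
        int64_t ndv = currentVersion - durableVersion;
        if (ndv <= startThrottleNDV) return 1.0;
        double excess = double(ndv - startThrottleNDV) / double(maxNDV - startThrottleNDV);
        return std::max(0.0, 1.0 - excess);
    }

    // Decide whether to serve or reject (throttle) an incoming read request.
    bool shouldServeRead(int64_t currentVersion, int64_t durableVersion) {
        std::uniform_real_distribution<double> dist(0.0, 1.0);
        return dist(rng) < readProbability(currentVersion, durableVersion);
    }
};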

@xumengpanda
Contributor

This is an insightful evaluation.

While trying to understand the figures, I'm confused about what each line means in the ReadOPS, WriteOPS, and NDV charts. Maybe you posted the legend for those figures earlier, but I couldn't find it.

@kaomakino
Contributor

ReadOPS = Read operations / second
WriteOPS = Write operations / second
NDV = Non-Durable Version (= Current Version - Durable Version)
These are time-series charts; the X-axis is time in seconds.

@xumengpanda
Contributor

I think I didn't explain myself clearly; I understood those abbreviations.

What I was asking is what the lines in different colors mean in each of those figures.
I'm guessing each line in the figures is the data for one client?

@kaomakino
Contributor

Ah, got it. Each line represents one storage process. In the NDV chart, there are three lines moving because one team (triple redundancy) is hot.

@kaomakino
Contributor

In addition to those per-server-process charts, the "Local Ratekeeper" chart shows the aggregate "probability of serving reads" across all storage processes. The "Ratekeeper" and "Worst Storage Queue" charts are taken from the ratekeeper metrics.

@etschannen
Contributor

My impression of the local ratekeeper design was that ratekeeper would only allow two storage servers (with triple replication) to start limiting their reads at any one time.

This implementation lets all storage servers limit themselves, and additionally will cause the main ratekeeper to throttle if they fall too far behind. With this implementation I am concerned that a saturating workload could cause all storage servers to each decide to throttle reads. This would lead to higher read latencies, which could cause transactions to start taking longer than 5 seconds, and lead to a death spiral.


double getPenalty() {
    return std::max(1.0, (queueSize() - (SERVER_KNOBS->TARGET_BYTES_PER_STORAGE_SERVER - 2.0*SERVER_KNOBS->SPRING_BYTES_STORAGE_SERVER)) / SERVER_KNOBS->SPRING_BYTES_STORAGE_SERVER);
    return std::max(std::min(1.0, (queueSize() - (SERVER_KNOBS->TARGET_BYTES_PER_STORAGE_SERVER -

The penalty should always be larger than 1. Load balancing works by trying to keep an equal number of requests outstanding to all servers; a penalty makes a single outstanding request act like more than one request, so you send less traffic to that server. I think the formula you want for the penalty would be 1.0/(1.0-currentRate()), and everything should be max instead of having a min.
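
For illustration, a minimal sketch of the shape of that suggestion, written against the surrounding storage-server context from the diff (queueSize(), currentRate(), SERVER_KNOBS). It assumes currentRate() stays in [0, 1) and grows as throttling increases; the division-by-zero guard is also an added assumption, so treat this as a sketch of the formula rather than the code that was merged:

double getPenalty() {
    // Queue-based component from the existing formula: always at least 1.0.
    double queuePenalty =
        std::max(1.0, (queueSize() - (SERVER_KNOBS->TARGET_BYTES_PER_STORAGE_SERVER -
                                      2.0 * SERVER_KNOBS->SPRING_BYTES_STORAGE_SERVER)) /
                          SERVER_KNOBS->SPRING_BYTES_STORAGE_SERVER);
    // Rate-based component per the suggestion: 1.0/(1.0 - currentRate()),
    // clamped so a rate of exactly 1.0 cannot divide by zero (assumption).
    double ratePenalty = 1.0 / std::max(1e-6, 1.0 - currentRate());
    // Take the max of the components, so the penalty is never below 1.0 and
    // is not capped by a min as it grows.
    return std::max(queuePenalty, ratePenalty);
}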

@tclinken tclinken force-pushed the features/local-rk branch from 6f0af55 to 8dbb231 on June 5, 2019 22:43
}

ACTOR Future<Void> updateStorage(StorageServer* data) {
    state std::string waitDescription = format("%s/updateStorage", data->thisServerID.toString().c_str());

Pass this string directly into checkDisabled so we do not construct it when not in simulation.
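
A minimal sketch of one way to read that suggestion; the checkDisabled name comes from the review comment above, but its signature and the g_network->isSimulated() guard here are assumptions made purely for illustration:

ACTOR Future<Void> updateStorage(StorageServer* data) {
    // Build the description only where it can actually be used, instead of
    // holding it in a state variable for the whole actor regardless of mode.
    if (g_network->isSimulated()) {
        checkDisabled(format("%s/updateStorage", data->thisServerID.toString().c_str()));
    }
    // ... rest of the actor unchanged ...
}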

@etschannen etschannen merged commit 9fdbf0c into apple:master Jun 11, 2019