Remove redundant server teams #1785
Merged
etschannen
merged 38 commits into
apple:master
from
xumengpanda:mengxu/server-team-remover-PR
Jul 20, 2019
We build more teams than we finally want so that the serverTeamRemover() actor can remove the teams whose members belong to too many teams. This gives a more balanced number of teams per server.
98c6abf to
61c1138
Compare
vishesh
reviewed
Jul 4, 2019
Each server has max(DESIRED_TEAMS_PER_SERVER, (DESIRED_TEAMS_PER_SERVER * storageTeamSize) / 2) teams.
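A minimal sketch of the arithmetic in this commit message, not the actual C++ code. The function name, the use of integer division, and the sample knob value of 5 are assumptions made for illustration:

```python
def target_teams_per_server(desired_teams_per_server: int, storage_team_size: int) -> int:
    """Per-server team target as worded in the commit message above.

    Integer division is an assumption; the real code may round differently.
    """
    return max(
        desired_teams_per_server,
        (desired_teams_per_server * storage_team_size) // 2,
    )


# Illustrative values only: 5 desired teams per server, triple replication.
print(target_teams_per_server(5, 3))  # 7
```

With a team size of 1 or 2 the first argument of max() wins, so the target never drops below DESIRED_TEAMS_PER_SERVER.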
61c1138 to
e39c9d1
Compare
Also change state variable to variable.
e39c9d1 to
c7a9962
Compare
xumengpanda
commented
Jul 8, 2019
Pick for removal the team whose least-loaded member server (the member that belongs to the fewest teams) has the largest team count. AddTeamsBestOf should keep building teams until each server has at least the target number of teams.
Otherwise, simulation may time out when team remover needs to remove hundreds of teams.
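The selection rule described in this commit can be sketched as follows. This is an illustrative model only: teams are plain tuples of server ids, and pick_team_to_remove is a hypothetical helper, not a function from the FoundationDB source.

```python
from collections import Counter


def pick_team_to_remove(teams):
    """Return the team whose least-loaded member is on the most teams.

    `teams` is a list of tuples of server ids (a simplification of the
    real team objects).
    """
    # How many teams each server currently belongs to.
    team_count = Counter(server for team in teams for server in team)
    # For each team take the minimum count over its members, then pick
    # the team where that minimum is largest.
    return max(teams, key=lambda team: min(team_count[s] for s in team))


teams = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d"), ("b", "d")]
# Servers a and b are on 3 teams each; c and d are on 2 each.
print(pick_team_to_remove(teams))  # ('a', 'b')
```

Removing the team made up only of heavily used servers lowers the counts of a and b without starving c or d, which is the balancing effect the commit message aims for.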
Also further speed up serverTeamRemover in simulation, and add comments
…false Because serverTeamRemover takes time to remove teams, getTeamCollectionValid() needs to wait for a while before concluding that the number of server teams is larger than the desired number.
xumengpanda
commented
Jul 9, 2019
2f88c88 to
600f16c
Compare
When a teamTracker is cancelled, e.g., by the redundant teamRemover or the badTeamRemover, we should decrease optimalTeamCount if the team counts as an optimal team, i.e., all members' machine fitness is no worse than unset and the team is healthy.
600f16c to
cf935ff
Compare
…move Before the serverTeamRemover tries to pick a team to remove, it waits for all data movement to finish, which means all teams are healthy. When the serverTeamRemover starts to pick a team to remove, we believe all servers are healthy.
Also change some code format in self review
xumengpanda
commented
Jul 12, 2019
etschannen
reviewed
Jul 13, 2019
…ted number of teams
Change the machine team remover to remove the machine team whose machines are on the most machine teams, using the same logic as the serverTeamRemover. The feature is guarded by the TR_FLAG_REMOVE_MT_WITH_MOST_TEAMS knob.
64b01cd to
415622f
Compare
Do not overbuild teams, because we may oscillate between building more teams and removing the redundant teams. The oscillation happens when the machines are not evenly distributed across availability zones. For example, in three_data_hall mode, two data halls have 1 machine each and the third data hall has 3 machines. To build enough teams (and more) for the servers in the third data hall, we will overbuild teams. However, the teamRemover will then remove those newly built teams.
If the minimum number of teams of the servers in a team is less than the target value (desired_team_number_per_server * (teamSize + 1) / 2), the team remover should not remove that team. Otherwise, DD will oscillate between building more teams and removing redundant teams. Do not run the consistency check for three_data_hall mode, because when machines are not evenly distributed across data halls, we need to build more teams than the total desired number to make sure the number of teams per server is no less than the target value.
The team remover does not remove a team if doing so would leave a server with 0 teams, so we currently disable the check until we have a better strategy to enforce the desired number of teams. This should not cause much of a problem in practice, whereas a server with 0 teams cannot host data, which is bad.
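A rough sketch of the removal guard described in these commits, under stated assumptions: can_remove_team is a hypothetical helper, team_count is a plain dict from server id to its number of teams, and integer division is assumed for the target formula.

```python
def can_remove_team(team, team_count, desired_teams_per_server):
    """Return True only if every member already meets the per-server target.

    Target formula as worded in the commit message:
    desired_team_number_per_server * (teamSize + 1) / 2
    (integer division is an assumption).
    """
    target = desired_teams_per_server * (len(team) + 1) // 2
    # If the least-loaded member is below target, removing this team would
    # only trigger a rebuild, i.e. the oscillation the commit describes.
    return min(team_count[server] for server in team) >= target


team = ("a", "b", "c")
# With 5 desired teams per server and team size 3, the target is 10.
print(can_remove_team(team, {"a": 12, "b": 10, "c": 11}, 5))  # True
print(can_remove_team(team, {"a": 12, "b": 9, "c": 11}, 5))   # False
```

Because the target is well above 1 in this example, the guard also covers the case of a server being left with 0 teams.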
etschannen
reviewed
Jul 18, 2019
1) No need to check server with only one team when teamRemover finds a server team or machine team to remove 2) Fix optimalTeamCount counting in teamTracker
d56bfdd to
64bee63
Compare
If serverTeamRemover removes a team before machineTeamRemover brings the machine team number down to the desired number, DD may create a new team (to replace teams removed by serverTeamRemover), which may later be removed by machineTeamRemover. This causes unnecessary extra data movement.
This PR solves the Issue #1761.
The serverTeamRemover() is similar to the machine team remover: it periodically picks a server team to remove until the total number of teams is no larger than the desired number.
To make each server have a similar number of teams, serverTeamRemover() first removes the server team whose members are on the largest number of server teams. In addition, when TeamCollection builds server teams, it builds more server teams than the desired number so that the serverTeamRemover() can better balance the number of teams per server.
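The overall loop described above can be modelled in a few lines. This is a simplified, synchronous sketch rather than the actual actor: remove_redundant_teams is a hypothetical name, teams are tuples of server ids, and the real remover's waits for data movement and health checks are omitted.

```python
from collections import Counter


def remove_redundant_teams(teams, desired_total):
    """Remove teams one at a time until at most `desired_total` remain.

    Each round removes the team whose members are on the most teams,
    judged by the least-loaded member of each team.
    """
    remaining = list(teams)
    removed = []
    while len(remaining) > desired_total:
        # Recount after every removal, since counts change each round.
        count = Counter(s for team in remaining for s in team)
        victim = max(remaining, key=lambda team: min(count[s] for s in team))
        remaining.remove(victim)
        removed.append(victim)
    return remaining, removed


# Five teams were (over)built, but only four are desired.
overbuilt = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d"), ("b", "d")]
kept, removed = remove_redundant_teams(overbuilt, desired_total=4)
print(removed)  # [('a', 'b')]
print(sorted(Counter(s for t in kept for s in t).items()))
# [('a', 2), ('b', 2), ('c', 2), ('d', 2)]
```

In this toy case, overbuilding by one team and then removing the most "crowded" one leaves every server on exactly two teams.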