Deploy new epoxy-extension-server + setup_k8s.sh related changes #236
Merged
The token-server and bmc-store-password ePoxy extensions are now replaced by a new ePoxy "extension server." Instead of individual extension container images, they are now all combined into a single binary and container image that listens on a single port.
See long comment in change set for details
Previously there were separate token-server and bmc-store-password containers and systemd units. These have now been combined into a single extension server that listens on a single port.
flannel was failing to start on sandbox nodes, leaving the node in a NotReady state because networking was not ready. The pod description had this event: "Error: failed to create containerd container: get apparmor_parser version: exec: "apparmor_parser": executable file not found in $PATH". This appears to be related to ongoing changes in containerd: containerd/containerd#8087
The create-control-plane.service is supposed to run _after_ mount-data-api, but that ordering was broken because the name of the service changed and I failed to update the "After" block with the new name.
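A stale name in an After= line is easy to miss, because systemd does not fail a unit over an ordering entry that names a unit which does not exist. As a rough sketch only: missing_after_units is a hypothetical helper, not part of this PR, and it assumes all relevant unit files live in a single directory. Units provided elsewhere (for example by the distribution) would show up as false positives.

```shell
#!/bin/bash
# Hypothetical helper (not from this PR): print every unit named on an
# After= line of the given unit file that has no matching file in unit_dir.
missing_after_units() {
  local unit_file="$1" unit_dir="$2" name
  # After= takes a space-separated list, so plain word splitting is intended here.
  for name in $(sed -n 's/^After=//p' "$unit_file"); do
    [ -e "${unit_dir}/${name}" ] || echo "$name"
  done
}
```

Run against the directory holding the image's unit files, any name printed is worth a second look before building.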
If the query to the live cluster for its version fails, then don't bother doing any version checking. The live cluster may not even exist, and possibly needs the images from this build so that it can be created.
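The control flow described here can be sketched roughly as follows. This is not the code from the change set: maybe_check_version is a hypothetical name, the query command is passed in by the caller rather than hard-coded, and the plain equality comparison is an assumption standing in for whatever version rule the real script applies.

```shell
#!/bin/bash
# Hypothetical helper (not from this PR).
# Usage: maybe_check_version <build_version> <query command...>
# Prints "skipped" when the query fails or returns nothing, otherwise
# "match" or "mismatch" (returning 1 on mismatch).
maybe_check_version() {
  local build_version="$1"; shift
  local live_version
  # Declared separately above so the assignment keeps the query's exit status.
  if ! live_version="$("$@" 2>/dev/null)" || [ -z "$live_version" ]; then
    # The live cluster may not exist yet, so there is nothing to compare against.
    echo "skipped"
    return 0
  fi
  if [ "$build_version" = "$live_version" ]; then
    echo "match"
  else
    echo "mismatch"
    return 1
  fi
}
```

Treating a failed query as "skip" rather than "fail" is what lets the build proceed when the cluster it is meant to create does not exist yet.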
Adds an additional, redundant check for the existence of /etc/kubernetes/admin.conf before initializing the cluster. A bug in our config caused the service unit to run even though that file existed, and kubeadm overwrote numerous things before finally erroring out. Can't hurt to add the additional check in this file. For nodes joining the cluster, wait for 90s (up from 60s) before trying to join to give the primary control plane node time to finish setting everything up. I discovered that 60s was not quite enough, and nodes joining the control plane might get a connection refused from the primary API endpoint.
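A minimal sketch of the kind of guard described above, under stated assumptions: should_init_cluster is a hypothetical name, and the directory argument exists only so the check can be exercised outside a real node. The actual check in create-control-plane.sh may look different.

```shell
#!/bin/bash
# Hypothetical helper (not from this PR): succeed only when no admin.conf
# exists under the given directory (default /etc/kubernetes), meaning the
# cluster has not been initialized on this machine.
should_init_cluster() {
  local kube_dir="${1:-/etc/kubernetes}"
  [ ! -e "${kube_dir}/admin.conf" ]
}

# Example use:
#   if should_init_cluster; then
#     : # run cluster initialization here
#   else
#     echo "admin.conf already exists; refusing to initialize again" >&2
#   fi
```

The check duplicates whatever condition the service unit already carries, which is the point: if the unit's condition is misconfigured, the script still refuses to run kubeadm over an existing cluster.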
On control plane machines, /etc/kubernetes is supposed to be a symlink to /mnt/cluster-data/kubernetes. When /etc/kubernetes already exists as a regular directory, ln creates the symlink inside /etc/kubernetes instead of replacing it, which breaks the configuration and the create-control-plane service. In any case, on control plane nodes that directory will be created automatically by kubeadm.
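The ln behavior is worth spelling out: when the link name is an existing directory, ln -s places the new link inside that directory rather than failing. The sketch below is one defensive way to handle it, not the fix applied in this PR. link_dir is a hypothetical helper, and it assumes GNU ln, where -n stops ln from descending into an existing symlink to a directory.

```shell
#!/bin/bash
# Hypothetical helper (not from this PR): create or replace a symlink at
# $2 pointing to $1, refusing to act when $2 is a real directory, since
# `ln -s` would then create the link inside that directory.
link_dir() {
  local target="$1" link="$2"
  if [ -d "$link" ] && [ ! -L "$link" ]; then
    echo "refusing: ${link} is a regular directory" >&2
    return 1
  fi
  # -f replaces an existing link; -n treats an existing symlink to a
  # directory as the thing to replace rather than a place to put the link.
  ln -sfn "$target" "$link"
}
```

For example, link_dir /mnt/cluster-data/kubernetes /etc/kubernetes would fail loudly on a machine where /etc/kubernetes had already been created as a directory, instead of silently producing /etc/kubernetes/kubernetes.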
ePoxy extension allocate_k8s_token V2 returns all the data needed to join the cluster. This commit removes all templating from setup_k8s.sh and moves it into the physical image filesystem. It is now a static script which can fetch everything it needs from allocate_k8s_token V2.
Previously, the script assumed that all VMs were going to be part of a MIG. We have decided on a hybrid approach with both MIGs and standard VMs, which required a few changes. Additionally, configure the script to use the V2 allocate_k8s_token ePoxy extension, which returns all the data needed to join the cluster, not just the token. This also required some refactoring of the code.
robertodauria (Contributor) approved these changes on May 2, 2023 and left a comment:
Reviewed 2 of 13 files at r1.
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @nkinkade)
configs/virtual_ubuntu/opt/mlab/bin/join-cluster.sh line 71 at r1 (raw file):
extension_v1="{\"v1\":{\"hostname\":\"${hostname}\",\"last_boot\":\"$(date --utc +%Y-%m-%dT%T.%NZ)\"}}"
# Fetch cluser bootstrap join data from the epoxy-extension-server.
Typo: cluster
This PR is primarily around deploying the new epoxy-extension-server to API machines, and removing the old token-server and bmc-store-password services from API machines.
Additionally, there are changes to the physical machine images, which now embed the script setup_k8s.sh into the filesystem, and the script is no longer a template but instead leverages the new V2 allocate_k8s_token extension, which returns all the data the machine needs to join the cluster. setup_k8s.sh is now a static script.
Beyond that, this PR also contains a few bug fixes in the create-control-plane.sh script which I discovered while testing all of this.