Images Improvements [Breaking Changes] #461
Just making this a breaking change cut down complexity quite a bit. I will clean this up some more, along with some other fixes, and get it in to unblock other work like #453.
[These will fail until a base image is pushed. Also, I think the < 1.12 bazel
build image tags need addressing, though perhaps we don't need to keep
supporting v1.11?]
On Mon, Apr 29, 2019 at 1:13 PM Kubernetes Prow Robot wrote:
@BenTheElder: The following tests *failed*, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-kind-verify | 51da10b | https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_kind/461/pull-kind-verify/1122953529917444097 | /test pull-kind-verify |
| pull-kind-conformance-parallel-1-13 | 51da10b | https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/sigs.k8s.io_kind/461/pull-kind-conformance-parallel-1-13/1122953529917444098/ | /test pull-kind-conformance-parallel-1-13 |
| pull-kind-conformance-parallel-1-12 | 51da10b | https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/sigs.k8s.io_kind/461/pull-kind-conformance-parallel-1-12/1122953529917444100/ | /test pull-kind-conformance-parallel-1-12 |
In images/base/Dockerfile:
- conntrack iptables iproute2 ethtool socat util-linux mount ebtables udev kmod aufs-tools \
- bash rsync \
+ containerd \
+ conntrack iptables iproute2 ethtool socat util-linux mount ebtables udev kmod \
we need gnupg2 in order to build the node images with apt
/kind build node-image --base-image kindest/base:containerd --image kindest/node:containerd --type apt
E: gnupg, gnupg2 and gnupg1 do not seem to be installed, but one of them is required for this operation
(23) Failed writing body
ERRO[22:37:41] Adding Kubernetes apt key failed! exit status 255
ERRO[22:37:41] Image build Failed! exit status 255
Error: error building node image: exit status 255
alternatively type=apt can install gnupg2 first if we decide not to get rid of it.
We need to remove the apt node tbh.
@@ -255,7 +255,7 @@ func (c *BuildContext) buildImage(dir string) error {
 		}()
 	}
 	if err != nil {
-		log.Errorf("Image build Failed! %v", err)
+		log.Errorf("Image build Failed! Failed to create build container: %v", err)
I wound up adding some more details to these to aid debugging while working on this. They're not strictly related, but it's a small change.
}

// load the docker image artifacts into the docker daemon
node.LoadImages()
the equivalent is now done at image build time
@@ -57,7 +57,7 @@ func (a *Action) Execute(ctx *actions.ActionContext) error {
 	kubeVersion, err := node.KubeVersion()
 	if err != nil {
 		// TODO(bentheelder): logging here
-		return errors.Wrap(err, "failed to get kubernetes version from node: %v")
+		return errors.Wrap(err, "failed to get kubernetes version from node")
this formatting was incorrect

// helper that calls `try()` in a loop until the deadline `until`
// has passed or `try()` returns true; returns whether try ever returned true
func tryUntil(until time.Time, try func() bool) bool {
this is the same code, but the only user is over here now (directly above) so I moved it.
// Deletes the machine-id embedded in the node image and regenerates a new one.
// This is necessary because both kubelet and other components like weave net
// use machine-id internally to distinguish nodes.
if err := handle.Command("rm", "-f", "/etc/machine-id").Run(); err != nil {
entrypoint does this now
// we need to change a few mounts once we have the container
// we'd do this ahead of time if we could, but --privileged implies things
// that don't seem to be configurable, and we need that flag
if err := node.FixMounts(); err != nil {
entrypoint does this now
	return err
}

if nodes.NeedProxy() {
entrypoint does this now
}

// signal the node container entrypoint to continue booting into systemd
if err := node.SignalStart(); err != nil {
there's nothing to signal now
// SignalStart sends SIGUSR1 to the node, which signals our entrypoint to boot
// see images/node/entrypoint
func (n *Node) SignalStart() error {
	return docker.Kill("SIGUSR1", n.name)
we should probably remove the docker.Kill call since this was the only user afaik. that can maybe be another PR though, this PR is already a bit big
@@ -135,105 +125,6 @@ func (n *Node) CopyFrom(source, dest string) error {
 	return docker.CopyFrom(n.name, source, dest)
 }
 
-// WaitForDocker waits for Docker to be ready on the node
-// it returns true on success, and false on a timeout
-func (n *Node) WaitForDocker(until time.Time) bool {
unnecessary now: containerd starts very fast in testing so far (not worth measuring). also, we don't need to wait to load images, because we don't need to load images anymore
}

// LoadImages loads image tarballs stored on the node into docker on the node
func (n *Node) LoadImages() {
image loading is at build time now
/hold cancel
/hold
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: amwat, aojea, BenTheElder
/hold cancel
if err := n.Command("mount", "-o", "remount,ro", "/sys").Run(); err != nil {
	return err
}
// kubernetes needs shared mount propagation
@BenTheElder these are omitted in `fix_mount` in the entrypoint. We are adopting `kind` as a cluster provider in Network Service Mesh (https://github.com/networkservicemesh/networkservicemesh/blob/master/docs/guide-kind.md) and the latest version is not working for us. The issue is that we are using `/var/lib` mounts, which rely on `/` being a shared mount.
What would be the best approach here? Shall we consider updating `fix_mounts`?
we can update `fix_mounts`, sorry about that!
These were a holdover from some previous docker-in-docker tooling predating kind. I haven't found much (anything?) documenting why you'd need to do this on the kubernetes host, and conformance was passing without them, so I removed them as voodoo, figuring kubelet must handle this 🤦
Looking again now I do see: https://kubernetes.io/docs/concepts/storage/volumes/#configuration
side note:
It is worth noting that kind as any other Kubernetes deployment tool would expect that the machine that hosts the Docker has at least 4 CPU cores and 4 GB of RAM. That is specifically pointed for OSX users in the official docs.
Filed #485, we should make those more accurate in our docs and then send y'all an update 🙃
filed #486, will send a fix momentarily
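For anyone debugging the shared-mount problem discussed above, here is a small diagnostic sketch (not part of kind) that checks whether a mount point is marked shared. It assumes the Linux `/proc/self/mountinfo` line format, where shared mounts carry a `shared:N` tag in the optional fields before the `-` separator. The function name `isSharedMount` and the sample lines are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// isSharedMount reports whether a /proc/self/mountinfo line describes
// mountPoint and carries a "shared:N" optional field.
// found is false when the line is malformed or is for another mount point.
func isSharedMount(line, mountPoint string) (shared, found bool) {
	fields := strings.Fields(line)
	// Fields: ID, parent ID, major:minor, root, mount point, options,
	// zero or more optional fields, "-", fstype, source, super options.
	if len(fields) < 7 || fields[4] != mountPoint {
		return false, false
	}
	for _, f := range fields[6:] {
		if f == "-" {
			break // end of the optional fields
		}
		if strings.HasPrefix(f, "shared:") {
			return true, true
		}
	}
	return false, true
}

func main() {
	// Sample lines in mountinfo format; the values are invented.
	private := "22 1 8:1 / / rw,relatime - ext4 /dev/sda1 rw"
	shared := "22 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw"
	for _, l := range []string{private, shared} {
		s, _ := isSharedMount(l, "/")
		fmt.Println(s)
	}
}
```

To use it against a real node, one could read `/proc/self/mountinfo` inside the node container line by line and pass each line with `"/"` as the mount point.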
Thanks, @BenTheElder, indeed I have found these mentions about the …
Now use the official kind node images and adapt how the compose-on-kube images are loaded into the kind cluster in e2e tests (image archives in /kind/images are no longer loaded at cluster creation, see kubernetes-sigs/kind#461). Signed-off-by: Jean-Christophe Sirot
Overall, this PR will make cluster boot faster and lighter. However, it is a breaking change between the images and the kind binary: newer images will require a newer kind binary, and a newer kind binary will require newer images.
Base Image Changes:
Node Image Changes:
Cluster Create Changes:
The entrypoint changes should make #148 much simpler; we only need to tackle the IP issues.