Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update cluster-role for cilium to prevent errors in agent startup #11466

Merged
merged 3 commits into from
Sep 29, 2024

Conversation

foobaar
Copy link
Contributor

@foobaar foobaar commented Aug 20, 2024

ciliumloadbalancerippools permissions exists in the cilium helm chart for version 1.13.0 https://github.com/cilium/cilium/blob/v1.13.0/install/kubernetes/cilium/templates/cilium-agent/clusterrole.yaml#L71

The agent also needs permissions to read/watch secrets for bgp auth secrets when using CiliumBGPPeeringPolicy with a secret.
Without the permissions for secrets, we see the following warns in the logs when cilium starts up on a node:

is forbidden: User \"system:serviceaccount:kube-system:cilium\" cannot list resource \"secrets\" in API group \"\" in the namespace \"kube-system\"" subsys=k8s
time="2024-08-20T14:18:16Z" level=warning msg="github.com/cilium/cilium/pkg/k8s/resource/resource.go:808: failed to list *v1.Secret: secrets is forbidden: User \"system:serviceaccount:kube-system:cilium\" cannot list resource \"secrets\" in API group \"\" in the namespace \"kube-system\"" subsys=klog
time="2024-08-20T14:18:16Z" level=error msg=k8sError error="github.com/cilium/cilium/pkg/k8s/resource/resource.go:808: Failed to watch *v1.Secret: failed to list *v1.Secret: secrets is forbidden: User \"system:serviceaccount:kube-system:cilium\" cannot list resource \"secrets\" in API group \"\" in the namespace \"kube-system\"" subsys=k8s

What type of PR is this?
/kind bug

What this PR does / why we need it:
Gives the cilium agent required permissions to watch loadbalancerippools and secrets

Does this PR introduce a user-facing change?:
NONE

Release Note:

Fix Cilium agent permission can't read loadbalancerippools and secrets

** Cilium Config

apiVersion: v1
data:
  agent-health-port: "9879"
  auto-direct-node-routes: "False"
  bgp-secrets-namespace: kube-system
  bpf-ct-global-any-max: "262144"
  bpf-ct-global-tcp-max: "524288"
  bpf-lb-acceleration: disabled
  bpf-lb-mode: dsr
  bpf-map-dynamic-size-ratio: "0.0025"
  cgroup-root: /run/cilium/cgroupv2
  clean-cilium-bpf-state: "false"
  clean-cilium-state: "false"
  cluster-name: default
  cluster-pool-ipv4-cidr: 10.Y.0.0/14
  cluster-pool-ipv4-mask-size: "25"
  cni-exclusive: "True"
  cni-log-file: /var/run/cilium/cilium-cni.log
  custom-cni-conf: "false"
  debug: "False"
  disable-cnp-status-updates: "True"
  enable-bgp-control-plane: "true"
  enable-bpf-clock-probe: "True"
  enable-bpf-masquerade: "False"
  enable-host-legacy-routing: "True"
  enable-hubble: "true"
  enable-ip-masq-agent: "False"
  enable-ipv4: "True"
  enable-ipv4-masquerade: "True"
  enable-ipv6: "False"
  enable-ipv6-masquerade: "True"
  enable-l2-announcements: "False"
  enable-metrics: "true"
  enable-remote-node-identity: "True"
  enable-well-known-identities: "False"
  etcd-config: |-
    ---
    endpoints:
      - https://10.X.0.36:2379
      - https://10.X.0.37:2379
      - https://10.X.0.38:2379
      - https://10.X.0.39:2379
      - https://10.X.0.40:2379

    # In case you want to use TLS in etcd, uncomment the 'ca-file' line
    # and create a kubernetes secret by following the tutorial in
    # https://cilium.link/etcd-config
    ca-file: "/etc/cilium/certs/ca_cert.crt"

    # In case you want client to server authentication, uncomment the following
    # lines and create a kubernetes secret by following the tutorial in
    # https://cilium.link/etcd-config
    key-file: "/etc/cilium/certs/key.pem"
    cert-file: "/etc/cilium/certs/cert.crt"
  hubble-disable-tls: "false"
  hubble-listen-address: :4244
  hubble-metrics: dns drop tcp flow icmp http
  hubble-metrics-server: :9965
  hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
  hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
  hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
  identity-allocation-mode: kvstore
  ipam: cluster-pool
  ipv4-native-routing-cidr: 10.0.0.0/8
  kube-proxy-replacement: strict
  kvstore: etcd
  kvstore-opt: '{"etcd.config": "/var/lib/etcd-config/etcd.config"}'
  monitor-aggregation: medium
  monitor-aggregation-flags: all
  operator-api-serve-addr: 127.0.0.1:9234
  operator-prometheus-serve-addr: :9963
  preallocate-bpf-maps: "False"
  prometheus-serve-addr: :9962
  routing-mode: native
  sidecar-istio-proxy-image: cilium/istio_proxy
  write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist

ciliumloadbalancerippools permissions exists in the cilium helm chart for version 1.13.0
https://github.com/cilium/cilium/blob/v1.13.0/install/kubernetes/cilium/templates/cilium-agent/clusterrole.yaml#L71

The agent also needs permissions to read/watch secrets for bgp auth secrets when using CiliumBGPPeeringPolicy with a secret.
@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Aug 20, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @foobaar. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@tico88612
Copy link
Member

@foobaar Please add release note block.

Fix Cilium agent permission can't read loadbalancerippools and secrets

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 21, 2024
@tico88612
Copy link
Member

Please fix release note block.

Like this:

```release-note
```

@tico88612
Copy link
Member

/retest

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 21, 2024
@foobaar
Copy link
Contributor Author

foobaar commented Aug 21, 2024

Updated release note

@tico88612
Copy link
Member

/retest

Copy link
Member

@tico88612 tico88612 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 22, 2024
@tico88612
Copy link
Member

/retest-failed

@foobaar
Copy link
Contributor Author

foobaar commented Sep 5, 2024

Hey is there anything additional you want from me on this MR?

@tico88612
Copy link
Member

@yankay This PR is about cilium agent fix. Please take a look, thanks.

@@ -28,6 +28,7 @@ rules:
- pods
- endpoints
- nodes
- secrets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT the secrets permission in the helm charts is only get for secrets, should this be as tight here ?

Copy link
Contributor Author

@foobaar foobaar Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the following errors in Cilium when i dont give watch/list

When I only have get permissions for secrets, I get the following Warn:

time="2024-09-05T18:09:04Z" level=warning msg="github.com/cilium/cilium/pkg/k8s/resource/resource.go:808: failed to list *v1.Secret: secrets is forbidden: User \"system:serviceaccount:kube-system:cilium\" cannot list resource \"secrets\" in API group \"\" in the namespace \"kube-system\"" subsys=klog
time="2024-09-05T18:09:04Z" level=error msg=k8sError error="github.com/cilium/cilium/pkg/k8s/resource/resource.go:808: Failed to watch *v1.Secret: failed to list *v1.Secret: secrets is forbidden: User \"system:serviceaccount:kube-system:cilium\" cannot list resource \"secrets\" in API group \"\" in the namespace \"kube-system\"" subsys=k8s

When I only have get,list permissions for secrets, I get the following Warn:

time="2024-09-05T18:11:59Z" level=error msg=k8sError error="github.com/cilium/cilium/pkg/k8s/resource/resource.go:808: Failed to watch *v1.Secret: unknown (get secrets)" subsys=k8s

Copy link
Contributor

@cyclinder cyclinder Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @foobaar, Did you see the errors in cilium-agent pod? Look at https://github.com/cilium/cilium/blob/c9723a8df3cfa336da1f8457a864105d8349acfe/install/kubernetes/cilium/templates/cilium-agent/clusterrole.yaml#L66, I don't see that these extra list/watch permissions are needed cilium upstream, maybe we need to investigate further. which cilium version you used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did/do see these errors in the cilium-agent pod.
The log messages I posted are from the cilium agent pod without list/watch permissions.
The version of cilium I am using is the current default for kubespray which is: 1.15.4

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove the list/watch secrets from the MR if this is delaying/blocking this MR.
Get secrets and ciliumloadbalancerippools are required by the agent though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! I will so that. Thanks @cyclinder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the MR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @foobaar, Do you open this issue to cilium repo?

Copy link
Contributor Author

@foobaar foobaar Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not submit an issue for the current version of cilium 1.15.4

- [cilium](https://github.com/cilium/cilium) v1.15.4

To submit a bug report to cilium, i am required to be on atleast 1.15.7

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will upgrade and submit an issue to cilium this week

@yankay
Copy link
Member

yankay commented Sep 12, 2024

Hi @cyclinder

would you please help to review it

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 13, 2024
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 13, 2024
@tico88612
Copy link
Member

/retest-failed

1 similar comment
@tico88612
Copy link
Member

/retest-failed

@foobaar
Copy link
Contributor Author

foobaar commented Sep 18, 2024

Hey! Is there anything else you want from me for this PR?

@tico88612
Copy link
Member

/retest

Copy link
Contributor

@cyclinder cyclinder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 24, 2024
@cyclinder
Copy link
Contributor

Please take a look, thanks :)

/cc @yankay

@yankay
Copy link
Member

yankay commented Sep 29, 2024

Thanks @foobaar
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cyclinder, foobaar, tico88612, yankay

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 29, 2024
@k8s-ci-robot k8s-ci-robot merged commit 860c15c into kubernetes-sigs:master Sep 29, 2024
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants