
Permission Denied Error attempting to start S3 Node Pod #211

Open
AlecAttwood opened this issue Jun 19, 2024 · 11 comments
Assignees
Labels
bug Something isn't working

Comments

@AlecAttwood

AlecAttwood commented Jun 19, 2024

/kind bug

When deploying the CSI S3 Driver to an EKS cluster, the node pod fails to start. Specifically, the s3-plugin container is in CrashLoopBackOff due to issues mounting a volume.

What happened?
When deploying the mountpoint-s3-csi-driver Helm chart, I encounter this error:

Error:
Failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/proc/1/mounts" to rootfs at "/host/proc/mounts": change mount propagation through procfd: mount /host/proc/mounts (via /proc/self/fd/6), flags: 0x44000: permission denied: unknown
Containers:
  s3-plugin:
    Container ID:  containerd://ac97f14f0d3493be036a3a45728b9738db6e62f1d080a8a6cf936b480f315740
    Image:         public.ecr.aws/mountpoint-s3-csi-driver/aws-mountpoint-s3-csi-driver:v1.6.0
    Image ID:      public.ecr.aws/mountpoint-s3-csi-driver/aws-mountpoint-s3-csi-driver@sha256:4479dd8e8b108ddf64da806c1d955f3c3fabcbc9ef8bbb85d299666ba8a4e4c1
    Port:          9808/TCP
    Host Port:     0/TCP
    Args:
      --endpoint=$(CSI_ENDPOINT)
      --logtostderr
      --v=4
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       StartError
      Message:      failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/proc/3512/mounts" to rootfs at "/host/proc/mounts": change mount propagation through procfd: mount /host/proc/mounts (via /proc/self/fd/6), flags: 0x44000: permission denied: unknown

What you expected to happen?

The Pods start correctly

How to reproduce it (as minimally and precisely as possible)?

Create a new EKS cluster and deploy the mountpoint-s3-csi-driver Helm chart.

Anything else we need to know?:
Since this is a permissions issue with the s3-plugin trying to mount a volume on the host, I assume it's due to my EKS setup. However, I haven't found much information on potential fixes. Usually, issues with mounting specifically under /proc are easy to fix: just don't mount to that path, since it's locked down. But the s3-plugin mount paths cannot be changed via the values file.


Looking for any potential fixes or things to try. Thanks.

Environment

  • Kubernetes version (use kubectl version): v1.29.4-eks-036c24b
  • Driver version: 1.7.0
    Using default values
@AlecAttwood AlecAttwood changed the title Permission Denided Error attemping to start S3 Node Pod Permission Denied Error attempting to start S3 Node Pod Jun 19, 2024
@monthonk
Contributor

Hi @AlecAttwood , what underlying operating system are you using for your hosts?

@AlecAttwood
Author

> Hi @AlecAttwood , what underlying operating system are you using for your hosts?

Amazon Linux:

sh-4.2$ cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
SUPPORT_END="2025-06-30"

@AlecAttwood
Author

AlecAttwood commented Jul 2, 2024

@monthonk I've had a better look, and I'm pretty sure our nodes have SELinux policy rules that are blocking the S3 CSI Driver from mounting on /proc. I tried multiple combinations of SELinux config in the chart's values.yaml, which didn't help. And judging from the SELinux audit log, changing the seLinuxOptions in the values.yaml didn't apply the securityContext to the pod properly. Can you suggest anything else to try, or potential SELinux configs which might work on a more locked-down node?
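For anyone else trying to confirm this on their nodes, a sketch of checking for SELinux mount denials on the host (assumes auditd and the SELinux userland tools are installed; run on the node, not in the pod):

```shell
#!/bin/sh
# Check whether SELinux is enforcing on this node.
getenforce

# Search the audit log for recent AVC denials involving mounts
# (requires root; "mounton" is the permission seen in this issue).
ausearch -m avc -ts recent | grep -i mounton
```

If `ausearch` reports `denied { mounton }` entries referencing the container's rootfs under `/host/proc/mounts`, the failure is coming from SELinux policy rather than from the driver itself.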

@unexge unexge self-assigned this Jul 3, 2024
@unexge
Contributor

unexge commented Jul 4, 2024

Hey @AlecAttwood I'm trying to reproduce the issue you're having.

I created a new EKS cluster:

$ eksctl create cluster -f mp-csi-testing-cluster-helm.yaml
mp-csi-testing-cluster-helm.yaml
```yaml
# EKS Cluster setup for installing Mountpoint CSI driver via Helm.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: mp-csi-testing-cluster-helm
  region: eu-north-1

iam:
  withOIDC: true
  serviceAccounts:
    - metadata:
        name: s3-csi-driver-sa
        namespace: kube-system
      roleName: eks-s3-csi-driver-role
      roleOnly: true
      attachPolicy:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - "s3:ListBucket"
            Resource: "arn:aws:s3:::YOUR-BUCKET-NAME"
          - Effect: Allow
            Action:
              - "s3:GetObject"
              - "s3:PutObject"
              - "s3:AbortMultipartUpload"
              - "s3:DeleteObject"
            Resource: "arn:aws:s3:::YOUR-BUCKET-NAME/*"

nodeGroups:
  - name: ng-1
    instanceType: m5.large
    desiredCapacity: 1
```

and installed Mountpoint CSI driver Helm chart:

$ helm install aws-mountpoint-s3-csi-driver aws-mountpoint-s3-csi-driver/aws-mountpoint-s3-csi-driver \
    --namespace kube-system

One thing that might not be clear in our installation instructions: if you're using IAM Roles for Service Accounts (IRSA), they ask you to create the role only (i.e., without the service account):

$ eksctl create iamserviceaccount \
    --name s3-csi-driver-sa \
    --namespace kube-system \
    --cluster $CLUSTER_NAME \
    --attach-policy-arn $ROLE_ARN \
    --approve \
    --role-name $ROLE_NAME \
    --region $REGION \
    --role-only # <-- Here

and our Helm chart creates a service account named s3-csi-driver-sa for you, but in order for IRSA to work you need to annotate that service account with your Role ARN. So you should have:

$ kubectl describe sa s3-csi-driver-sa -n kube-system | rg eks.amazonaws.com/role-arn
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::account:role/eks-s3-csi-driver-role

in s3-csi-driver-sa; otherwise you might get a permission denied error, but that would come from Mountpoint while trying to perform ListObjects or similar, not at the container initialization phase.
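If the annotation turns out to be missing, it can be added manually; a sketch (the account ID in the ARN is a placeholder, and the DaemonSet name is an assumption; check yours with `kubectl get ds -n kube-system`):

```shell
#!/bin/sh
# Annotate the Helm-created service account with the IRSA role ARN.
# Replace the ARN below with your own role's ARN.
kubectl annotate serviceaccount s3-csi-driver-sa \
    --namespace kube-system \
    eks.amazonaws.com/role-arn=arn:aws:iam::111122223333:role/eks-s3-csi-driver-role

# Restart the node pods so they pick up the new credentials.
kubectl rollout restart daemonset s3-csi-node -n kube-system
```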

I tried with both 1.7.0 (latest) and 1.6.0 versions of CSI driver and also used Amazon Linux as my node OS but couldn't reproduce the issue you're having.

Did you do any other configuration that you think might be related?

Btw, you should be able to override seLinuxOptions by passing --set node.seLinuxOptions.level="..." to Helm. You can see the output of the manifests without applying them:

$ helm install aws-mountpoint-s3-csi-driver aws-mountpoint-s3-csi-driver/aws-mountpoint-s3-csi-driver \
    --namespace kube-system \
    --set node.seLinuxOptions.level="..." \
    --dry-run --debug
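For a quicker check that the option actually gets rendered, one could also pipe the templated manifests through grep (a sketch; the level value and grep context sizes are arbitrary):

```shell
#!/bin/sh
# Render the chart locally and confirm seLinuxOptions appears in the
# generated securityContext (no changes are applied to the cluster).
helm template aws-mountpoint-s3-csi-driver aws-mountpoint-s3-csi-driver/aws-mountpoint-s3-csi-driver \
    --namespace kube-system \
    --set node.seLinuxOptions.level="s0" \
  | grep -B2 -A4 'seLinuxOptions'
```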

@AlecAttwood
Author

Hi @unexge,

> Did you do some other configuration you think it might be related?

I mentioned above that our nodes are hardened with extra CIS hardening and have extra SELinux policies added. I'm pretty sure this is what's causing the issue. The pods never actually start, and there are SELinux audit logs on our nodes listing denied actions when mounting on /proc/mounts.

> you should be able to override seLinuxOptions by passing --set node.seLinuxOptions.level="..." to Helm

I did this, then queried the security contexts for all the pods and didn't see anything. Every time I changed the SELinux options and re-deployed, I looked at the SELinux audit logs and they were exactly the same, implying that setting the values didn't change anything or didn't apply the security context. Which is weird; I'm not 100% sure if that's an issue with the Helm chart or with the logging.

I'll continue to investigate. I'm still convinced that if I can set the right SELinux permissions it should work, even with our hardened AMI. Are there any other combinations of SELinux config to try? I'm not that familiar with it; I'd assume the default config should be enough, but it's not working in this case.
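One way to see what securityContext (if any) the running containers actually got is to dump the pod spec and slice out the s3-plugin container (a sketch; the awk range bounds are an assumption about field ordering in the rendered YAML):

```shell
#!/bin/sh
# Dump the pod specs in kube-system and check which securityContext, if
# any, is set on the s3-plugin container.
kubectl get pods -n kube-system -o yaml \
  | awk '/name: s3-plugin/,/volumeMounts/' \
  | grep -A3 securityContext
```

No output here would mean no securityContext reached the container, which would match the observation that changing the Helm values had no visible effect.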

@unexge
Contributor

unexge commented Jul 8, 2024

Hey @AlecAttwood,

I was able to reproduce the issue with /proc/mounts on AL2023 and looking into it.

Looking at the audit log I got from my host:

type=AVC msg=audit(1720166060.453:300): avc:  denied  { mounton } for  pid=2584 comm="runc:[2:INIT]" path="/run/containerd/io.containerd.runtime.v2.task/k8s.io/cd2c4eb5d338fb23f3d3e537bd948aede52f33a0defbc05a5cd19a5c4ef808a8/rootfs/host/proc/mounts" dev="proc" ino=20684 scontext=system_u:system_r:unconfined_service_t:s0 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=file permissive=1

Seems like it fails when runc/containerd tries to mount /proc/mounts, which happens before any of our containers/pods run (that's probably why you don't see any difference when you change SELinux settings). We probably need to change SELinux settings for runc/containerd (either via some configuration runc/containerd exposes, similar to Kubernetes' seLinuxOptions, or via SELinux transition policies, though I'm not an expert and not sure if it's possible) to allow them to mount on /proc/mounts.

I'm looking into whether it's possible to get rid of the /proc/mounts mount altogether.

@AlecAttwood
Author

Thanks for looking into it. That audit log looks almost identical to what I was seeing. On my side, we're going to add a temporary SELinux policy on our nodes to allow mounting on /proc. I'll keep an eye on this issue and upgrade the chart when a fix is released. Thanks again.
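A temporary policy like the one described can be generated straight from the recorded denials with audit2allow; a sketch (the module name is arbitrary; run as root on the node, and note this deliberately loosens the hardened policy until a driver-side fix ships):

```shell
#!/bin/sh
# Build a local SELinux policy module from the mounton denials in the
# audit log, then load it.
ausearch -m avc -ts recent | grep mounton | audit2allow -M s3csi_proc_mount
semodule -i s3csi_proc_mount.pp

# To remove the workaround later:
# semodule -r s3csi_proc_mount
```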

@unexge unexge added the bug Something isn't working label Jul 30, 2024
@DWS-guy

DWS-guy commented Sep 11, 2024

Any updates on this issue? I am experiencing a similar problem with mounting /proc/mounts inside a kind cluster, though I am not using SELinux

@unexge
Contributor

unexge commented Sep 12, 2024

@DWS-guy no updates yet unfortunately. Are you getting mount /host/proc/mounts (via /proc/self/fd/6), flags: 0x44000: permission denied: unknown as well?

@DWS-guy

DWS-guy commented Sep 12, 2024

@unexge Correct, I am getting that exact error. SELinux is not present on my system

@dannycjones
Contributor

> Any updates on this issue? I am experiencing a similar problem with mounting /proc/mounts inside a kind cluster, though I am not using SELinux

We are hoping to remove the dependency on /proc/mounts at some point, but I have nothing to share at this time.

I'd recommend opening a new bug report with your logs so that we can investigate, although I would note that we don't officially support kind clusters, only open-source Kubernetes (K8S) and Amazon EKS.
