Add seaweedfs to contrib (#2826)
* Add seaweedfs to contrib

Signed-off-by: Patrick Schönthaler <[email protected]>

* Review changes

Signed-off-by: Patrick Schönthaler <[email protected]>

* Add workflow for seaweedfs pipeline test

Signed-off-by: Patrick Schönthaler <[email protected]>

* Remove minio svc backup

Signed-off-by: Patrick Schönthaler <[email protected]>

* Create user in workflow

Signed-off-by: Patrick Schönthaler <[email protected]>

* Fix yamllint

Signed-off-by: Patrick Schönthaler <[email protected]>

* Use other action to create KinD cluster

Signed-off-by: Patrick Schönthaler <[email protected]>

* Rename new workflow

Signed-off-by: Patrick Schönthaler <[email protected]>

* Add namespace to kubectl commands

Signed-off-by: Patrick Schönthaler <[email protected]>

* Add missing networkpolicy for seaweedfs

Signed-off-by: Patrick Schönthaler <[email protected]>

* Add missing newlines at end of files

Signed-off-by: Patrick Schönthaler <[email protected]>

* Integrate changes from master

Signed-off-by: Patrick Schönthaler <[email protected]>

* Adjust PR trigger paths

Signed-off-by: Patrick Schönthaler <[email protected]>

---------

Signed-off-by: Patrick Schönthaler <[email protected]>
pschoen-itsc authored Oct 2, 2024
1 parent a7c646e commit aee7929
Showing 12 changed files with 343 additions and 0 deletions.
100 changes: 100 additions & 0 deletions .github/workflows/pipeline_swfs_test.yaml
@@ -0,0 +1,100 @@
name: Deploy and test Kubeflow Pipelines manifests with seaweedfs and m2m auth in KinD
on:
  pull_request:
    paths:
    - tests/gh-actions/install_KinD_create_KinD_cluster_install_kustomize.sh
    - .github/workflows/pipeline_swfs_test.yaml
    - apps/pipeline/upstream/**
    - tests/gh-actions/install_istio.sh
    - tests/gh-actions/install_cert_manager.sh
    - tests/gh-actions/install_oauth2-proxy.sh
    - common/cert-manager/**
    - common/oauth2-proxy/**
    - common/istio*/**
    - contrib/seaweedfs/**

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v4

    - name: Install KinD, Create KinD cluster and Install kustomize
      run: ./tests/gh-actions/install_KinD_create_KinD_cluster_install_kustomize.sh

    - name: Install kubectl
      run: ./tests/gh-actions/install_kubectl.sh

    - name: Install Istio
      run: ./tests/gh-actions/install_istio.sh

    - name: Install oauth2-proxy
      run: ./tests/gh-actions/install_oauth2-proxy.sh

    - name: Install cert-manager
      run: ./tests/gh-actions/install_cert_manager.sh

    - name: Create kubeflow namespace
      run: kustomize build common/kubeflow-namespace/base | kubectl apply -f -

    - name: Install KF Pipelines
      run: ./tests/gh-actions/install_pipelines.sh

    - name: Install KF Multi Tenancy
      run: ./tests/gh-actions/install_multi_tenancy.sh

    - name: Install kubeflow-istio-resources
      run: kustomize build common/istio-1-22/kubeflow-istio-resources/base | kubectl apply -f -

    - name: Create KF Profile
      run: kustomize build common/user-namespace/base | kubectl apply -f -

    - name: Install seaweedfs
      run: |
        kustomize build contrib/seaweedfs/istio | kubectl apply -f -
        kubectl -n kubeflow wait --for=condition=available --timeout=600s deploy/seaweedfs
        kubectl -n kubeflow exec deploy/seaweedfs -c seaweedfs -- sh -c "echo \"s3.configure -user minio -access_key minio -secret_key minio123 -actions Read,Write,List -apply\" | /usr/bin/weed shell"
    - name: port forward
      run: |
        ingress_gateway_service=$(kubectl get svc --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
        nohup kubectl port-forward --namespace istio-system svc/${ingress_gateway_service} 8080:80 &
        while ! curl localhost:8080; do echo waiting for port-forwarding; sleep 1; done; echo port-forwarding ready
    - name: List and deploy test pipeline with authorized ServiceAccount Token
      run: |
        pip3 install kfp==2.4.0
        KF_PROFILE=kubeflow-user-example-com
        TOKEN="$(kubectl -n $KF_PROFILE create token default-editor)"
        python -c '
        from time import sleep
        import kfp
        import sys
        token = sys.argv[1]
        namespace = sys.argv[2]
        client = kfp.Client(host="http://localhost:8080/pipeline", existing_token=token)
        pipeline = client.list_pipelines().pipelines[0]
        pipeline_name = pipeline.display_name
        pipeline_id = pipeline.pipeline_id
        pipeline_version_id = client.list_pipeline_versions(pipeline_id).pipeline_versions[0].pipeline_version_id
        experiment_id = client.create_experiment("seaweedfs-test", namespace=namespace).experiment_id
        print(f"Starting pipeline {pipeline_name}.")
        run_id = client.run_pipeline(experiment_id=experiment_id, job_name="m2m-test", pipeline_id=pipeline_id, version_id=pipeline_version_id).run_id
        while True:
          status = client.get_run(run_id=run_id).state
          if status in ["PENDING", "RUNNING"]:
            print(f"Waiting for run_id: {run_id}, status: {status}.")
            sleep(10)
          else:
            print(f"Run with id {run_id} finished with status: {status}.")
            if status != "SUCCEEDED":
              print("Pipeline failed")
              raise SystemExit(1)
            break
        ' "${TOKEN}" "${KF_PROFILE}"
6 changes: 6 additions & 0 deletions contrib/seaweedfs/OWNERS
@@ -0,0 +1,6 @@
approvers:
# - pschoen-itsc
- juliusvonkohout
reviewers:
# - pschoen-itsc
- juliusvonkohout
51 changes: 51 additions & 0 deletions contrib/seaweedfs/README.md
@@ -0,0 +1,51 @@
# SeaweedFS

- [Official documentation](https://github.com/seaweedfs/seaweedfs/wiki)
- [Official repository](https://github.com/seaweedfs/seaweedfs)

SeaweedFS is a simple and highly scalable distributed file system. It exposes an S3-compatible interface, which makes it usable as an object store for Kubeflow.

## Prerequisites

- Kubernetes (any recent version should work)
- You should have `kubectl` available and configured to talk to the desired cluster.
- `kustomize`
- If you installed kubeflow with minio, use the `istio` dir instead of `base` for the kustomize commands.

## Compile manifests

```bash
kubectl kustomize ./base/
```

## Install SeaweedFS

**WARNING**
This replaces the `minio-service` Service and redirects its traffic to SeaweedFS.

```bash
# Optional but recommended: back up the existing minio-service
kubectl get -n kubeflow svc minio-service -o=jsonpath='{.metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration}' > svc-minio-service-backup.json

kubectl kustomize ./base/ | kubectl apply -f -
```

## Verify deployment

Run
```bash
./test.sh
```
The readiness probe on the container already verifies that the S3 gateway starts correctly.
While the script's port-forward is running, the S3 endpoint is available at http://localhost:8333.
To create access keys, open a shell in the pod and use `weed shell` to configure your instance.
Create a user with the command `s3.configure -user <username> -access_key <access-key> -secret_key <secret-key> -actions Read:<my-bucket>/<my-prefix>,Write:<my-bucket>/<my-prefix> -apply`.
Further documentation is available on the [SeaweedFS Amazon S3 API wiki page](https://github.com/seaweedfs/seaweedfs/wiki/Amazon-S3-API).
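
If you create several users, composing the `s3.configure` line by hand gets error-prone. The following is a minimal sketch, not part of this contribution, that builds the command string and the `kubectl exec` invocation used by `test.sh`. The helper names `build_s3_configure` and `kubectl_exec_argv` are hypothetical; the flags, the `kubeflow` namespace, the `seaweedfs` deployment and container names, and the `/usr/bin/weed` path are taken from the manifests and scripts in this directory.

```python
import shlex


def build_s3_configure(user, access_key, secret_key, actions):
    """Compose a `weed shell` s3.configure command line.

    `actions` is a list of (action, scope) pairs. A falsy scope yields a
    global action such as `Read`; otherwise `Read:<bucket>/<prefix>`.
    """
    rendered = [f"{action}:{scope}" if scope else action for action, scope in actions]
    return (
        f"s3.configure -user {user} -access_key {access_key} "
        f"-secret_key {secret_key} -actions {','.join(rendered)} -apply"
    )


def kubectl_exec_argv(command, namespace="kubeflow", deployment="seaweedfs"):
    """Return the argv that pipes `command` into `weed shell` inside the pod."""
    inner = f"echo {shlex.quote(command)} | /usr/bin/weed shell"
    return [
        "kubectl", "-n", namespace, "exec", f"deploy/{deployment}",
        "-c", "seaweedfs", "--", "sh", "-c", inner,
    ]


if __name__ == "__main__":
    # Same credentials that test.sh configures (assumed test values only).
    cmd = build_s3_configure(
        "minio", "minio", "minio123",
        [("Read", None), ("Write", None), ("List", None)],
    )
    print(" ".join(shlex.quote(part) for part in kubectl_exec_argv(cmd)))
```

The script only prints the command; pass the list from `kubectl_exec_argv` to `subprocess.run` if you want to execute it against your cluster.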

## Uninstall SeaweedFS

```bash
kubectl kustomize ./base/ | kubectl delete -f -
# Restore minio-service from backup
kubectl apply -f svc-minio-service-backup.json
```
3 changes: 3 additions & 0 deletions contrib/seaweedfs/UPDGRADE.md
@@ -0,0 +1,3 @@
# Upgrade SeaweedFS

Change the image tag in the Deployment to the desired version. You can find the available images [here](https://hub.docker.com/r/chrislusf/seaweedfs).
9 changes: 9 additions & 0 deletions contrib/seaweedfs/base/kustomization.yaml
@@ -0,0 +1,9 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: kubeflow

resources:
- seaweedfs-deployment.yaml
- seaweedfs-pvc.yaml
- seaweedfs-service.yaml
- seadweedfs-networkpolicy.yaml
28 changes: 28 additions & 0 deletions contrib/seaweedfs/base/seadweedfs-networkpolicy.yaml
@@ -0,0 +1,28 @@
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: seaweedfs
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchExpressions:
        - key: app.kubernetes.io/part-of
          operator: In
          values:
          - kubeflow-profile
    - namespaceSelector:
        matchExpressions:
        - key: kubernetes.io/metadata.name
          operator: In
          values:
          - istio-system
    - podSelector: {}
  podSelector:
    matchExpressions:
    - key: app
      operator: In
      values:
      - seaweedfs
  policyTypes:
  - Ingress
65 changes: 65 additions & 0 deletions contrib/seaweedfs/base/seaweedfs-deployment.yaml
@@ -0,0 +1,65 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: seaweedfs
  namespace: kubeflow
  labels:
    app: seaweedfs
spec:
  selector:
    matchLabels:
      app: seaweedfs
  strategy:
    type: Recreate
  # Single container setup not scalable
  replicas: 1
  template:
    metadata:
      labels:
        app: seaweedfs
    spec:
      containers:
      - name: seaweedfs
        image: 'chrislusf/seaweedfs:3.69'
        args:
        - 'server'
        - '-dir=/data'
        - '-s3'
        ports:
        - containerPort: 8333
        readinessProbe:
          httpGet:
            path: /status
            port: 8333
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 15
          successThreshold: 1
          failureThreshold: 100
          timeoutSeconds: 10
        securityContext: # Using restricted profile
          allowPrivilegeEscalation: false
          privileged: false
          runAsNonRoot: true
          # image defaults to root user
          runAsUser: 1001
          runAsGroup: 1001
          seccompProfile:
            type: RuntimeDefault
          capabilities:
            drop:
            - ALL
            add:
            - NET_BIND_SERVICE
        volumeMounts:
        - mountPath: /data
          name: data
        resources:
          # Benchmark this, just taken from minio
          requests:
            cpu: 20m
            memory: 100Mi
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: seaweedfs-pvc
11 changes: 11 additions & 0 deletions contrib/seaweedfs/base/seaweedfs-pvc.yaml
@@ -0,0 +1,11 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: seaweedfs-pvc
  namespace: kubeflow
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
13 changes: 13 additions & 0 deletions contrib/seaweedfs/base/seaweedfs-service.yaml
@@ -0,0 +1,13 @@
apiVersion: v1
kind: Service
metadata:
  name: minio-service
  namespace: kubeflow
spec:
  ports:
  - name: http
    port: 9000
    protocol: TCP
    targetPort: 8333
  selector:
    app: seaweedfs
30 changes: 30 additions & 0 deletions contrib/seaweedfs/istio/istio-authorization-policy.yaml
@@ -0,0 +1,30 @@
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: seaweedfs-service
spec:
  action: ALLOW
  selector:
    matchLabels:
      app: seaweedfs
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/kubeflow/sa/ml-pipeline
  - from:
    - source:
        principals:
        - cluster.local/ns/kubeflow/sa/ml-pipeline-ui
  # Allow traffic from User Pipeline Pods, which don't have a sidecar.
  - {}
---
apiVersion: "networking.istio.io/v1alpha3"
kind: DestinationRule
metadata:
  name: ml-pipeline-seaweedfs
spec:
  host: minio-service.kubeflow.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
7 changes: 7 additions & 0 deletions contrib/seaweedfs/istio/kustomization.yaml
@@ -0,0 +1,7 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: kubeflow

resources:
- ../base/
- istio-authorization-policy.yaml
20 changes: 20 additions & 0 deletions contrib/seaweedfs/test.sh
@@ -0,0 +1,20 @@
#!/usr/bin/env bash

set -xe

kubectl create ns kubeflow || echo "namespace kubeflow already exists"
kubectl get -n kubeflow svc minio-service -o=jsonpath='{.metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration}' > svc-minio-service-backup.json
kustomize build istio/ | kubectl apply --server-side -f -
kubectl -n kubeflow wait --for=condition=available --timeout=600s deploy/seaweedfs
kubectl -n kubeflow exec deployments/seaweedfs -c seaweedfs -- sh -c "echo \"s3.configure -user minio -access_key minio -secret_key minio123 -actions Read,Write,List -apply\" | /usr/bin/weed shell"

# Register cleanup before the blocking port-forward so it runs when the script exits.
function trap_handler {
  kubectl -n kubeflow logs -l app=seaweedfs --tail=100
  kustomize build istio/ | kubectl delete -f -
  kubectl apply -f svc-minio-service-backup.json
}
trap trap_handler EXIT

echo "S3 endpoint available on localhost:8333"
kubectl -n kubeflow port-forward svc/minio-service 8333:9000
