Add scorer based on active loras #43

Open
wants to merge 90 commits into base: dev
90 commits
0c95c2a
Add initial support for scorers, used as part of decision which pod w…
mayabar Apr 10, 2025
bde57da
Fixes in scores infrastructure & session aware scorer
mayabar Apr 14, 2025
8f9785e
- Add cleanup for session->pod map
mayabar Apr 14, 2025
52af66f
Export score and pod for external implementations
mayabar Apr 14, 2025
90e23bc
Rename session id header
mayabar Apr 15, 2025
c946acc
Separate code of Scorer interface and scorer implementations + add sc…
mayabar Apr 15, 2025
137ca09
Remove vllmRequest from scoreTargets API since it exists in the context
mayabar Apr 15, 2025
c42f72a
Support negative score weights
mayabar Apr 15, 2025
a05a573
Fix fakeDataStore to be compatible with DataStore interface
mayabar Apr 15, 2025
aca8e07
- Check for nils in list of available pods in main scoring function o…
mayabar Apr 16, 2025
170b1a3
Active loras scorer added
mayabar Apr 16, 2025
bae2a66
[version bump] Promote 0.0.2 to prod, bump dev to 0.0.3
Apr 16, 2025
c398f4b
chore: move openshift router deployment to extra
shaneutt Apr 16, 2025
95e6bb1
feat: add deployment for sail operator
shaneutt Apr 16, 2025
6d12b06
feat: add istio control-plane deployment
shaneutt Apr 16, 2025
8c4eb46
feat: add vllm simulator deployment
shaneutt Apr 16, 2025
58ef159
feat: add inference-gateway deployment
shaneutt Apr 16, 2025
f606e0d
feat: add kind environment deployment
shaneutt Apr 16, 2025
c679724
feat: kind dev env deployment script
shaneutt Apr 16, 2025
a426e55
Bump golang.org/x/net from 0.37.0 to 0.38.0
dependabot[bot] Apr 16, 2025
09e79e6
Merge pull request #5 from neuralmagic/dependabot/go_modules/golang.o…
clubanderson Apr 16, 2025
d8303a0
Merge pull request #1 from mayabar/main
mayabar Apr 17, 2025
dad8db2
Merge pull request #4 from shaneutt/shaneutt/initial-dev-deployments
clubanderson Apr 17, 2025
f608526
fix: basic container image builds for linux
shaneutt Apr 17, 2025
2dd2ee7
fix: lint fix
shaneutt Apr 17, 2025
10be213
Merge pull request #10 from shaneutt/shaneutt/fix-image-builds
elevran Apr 17, 2025
a24a801
fix: move openshift deployment to environments
shaneutt Apr 17, 2025
950e07b
fix: retarget kustomize deployments in Makefile
shaneutt Apr 17, 2025
47bed9d
empty top level kustomization.yaml - make CICD happy
elevran Apr 17, 2025
0c4e6c8
Merge pull request #17 from elevran/deploy_kustomization_yaml
clubanderson Apr 17, 2025
e0dcba6
Merge pull request #15 from shaneutt/fix-kustomize-envs
clubanderson Apr 17, 2025
f36b10f
feat: add infra targets for ocp to Makefile
shaneutt Apr 17, 2025
b3feea1
Merge pull request #20 from shaneutt/makefile-openshift-infrastructur…
clubanderson Apr 17, 2025
320961c
Minor fixes to enable image building matching GIE
elevran Apr 17, 2025
1024f07
Yamls for inference model and inference pool
mayabar Apr 18, 2025
7655e3c
update vllm-sim deployment yaml
mayabar Apr 18, 2025
c9fd65b
draft changes in run-kind
elevran Apr 18, 2025
5e0ad98
testing new deployment dev
clubanderson Apr 18, 2025
9413ed0
testing new deployment dev
clubanderson Apr 18, 2025
e41f154
chore: inference-gateway component fixes for namespace-level deployment
shaneutt Apr 18, 2025
aad629b
Merge pull request #21 from elevran/image_build
shaneutt Apr 18, 2025
84da7a5
docs: add issue links for some TODOs
shaneutt Apr 18, 2025
be9c800
Merge pull request #29 from elevran/kind_env
shaneutt Apr 18, 2025
8004347
Merge pull request #24 from mayabar/dev
shaneutt Apr 18, 2025
6bd139f
feat: add crd deployment component
shaneutt Apr 18, 2025
9202462
fix: remove podman load instructions that are no longer needed
shaneutt Apr 18, 2025
8e46ea9
upgrade golang.org/x/oauth2 to v0.27.0
elevran Apr 18, 2025
e7a53af
Merge pull request #31 from shaneutt/shaneutt/crd-deployments
clubanderson Apr 18, 2025
142668c
Merge pull request #32 from elevran/oauth2_vuln
clubanderson Apr 18, 2025
413c4a7
feat: add istio crds to deployments
shaneutt Apr 18, 2025
43ab7c1
chore: add custom build for istio-control-plane
shaneutt Apr 18, 2025
a3355b3
chore: cleanup vllm-sim deployments
shaneutt Apr 18, 2025
f699eb7
chore: update gateway deployment for gie compat
shaneutt Apr 18, 2025
760414e
chore: kind env script cleanup
shaneutt Apr 18, 2025
5659f22
chore: cleanup sail operator deployment
shaneutt Apr 18, 2025
1ef859f
chore: cleanup kind dev env deployment
shaneutt Apr 18, 2025
322a421
chore: move kind dev env deploys
shaneutt Apr 18, 2025
383d2db
chore: move openshift dev env deploys
shaneutt Apr 18, 2025
55bf0f8
feat: add environment.dev.kind makefile target
shaneutt Apr 18, 2025
896270f
docs: add development documentation
shaneutt Apr 18, 2025
a83015b
chore: cleanup some language in the Makefile
shaneutt Apr 18, 2025
33a10b5
Merge pull request #33 from shaneutt/shaneutt/kind-full-stack
chcost Apr 18, 2025
175fbec
added infra pipeline run stuff
clubanderson Apr 19, 2025
b7bcf73
added infra pipeline run stuff
clubanderson Apr 19, 2025
eea72e4
added infra pipeline run stuff
clubanderson Apr 19, 2025
72a4328
pre-commit hook added
clubanderson Apr 19, 2025
04a0e25
test trivy scan
clubanderson Apr 21, 2025
5098332
Setup the Istio service to be a NodePort service and not a ClusterIP …
shmuelk Apr 21, 2025
84ff88c
Merge pull request #37 from oglok/istio-mem
oglok Apr 21, 2025
a7dbfa6
Merge pull request #38 from shmuelk/kind-nodeport
shmuelk Apr 21, 2025
c65cacb
chore: move install-hooks target into tools section
shaneutt Apr 21, 2025
64cc8fa
implementing vuln scanner and h100 cluster deployment
clubanderson Apr 21, 2025
5bb12ee
feat: add openshift-infra deployment w/ RBAC for dev envs
shaneutt Apr 21, 2025
785ee7a
Merge pull request #39 from shaneutt/rbac-for-openshift-env
clubanderson Apr 21, 2025
9df4be1
implementing vuln scanner and h100 cluster deployment
clubanderson Apr 21, 2025
d187ce7
chore: remove openshift dev environment
shaneutt Apr 21, 2025
2e3c9c8
feat: add kubernetes-infra deployment
shaneutt Apr 21, 2025
c3732ef
feat: add kubernetes dev deployment
shaneutt Apr 21, 2025
a6ca97a
fix: remove hardcoded namespace in makefile
shaneutt Apr 21, 2025
aa5e24b
chore: remove old dev env targets in makefile
shaneutt Apr 21, 2025
6c1c7d0
chore: cleanup kind dev env in makefile
shaneutt Apr 21, 2025
214f649
feat: add kubernetes dev env makefile targets
shaneutt Apr 21, 2025
6bedb47
docs: update DEVELOPMENT.md with k8s dev env
shaneutt Apr 21, 2025
f4f14bc
Merge pull request #40 from shaneutt/kubernetes-dev-env
Gregory-Pereira Apr 21, 2025
893541a
docs: remove WIPs from dev envs that are now complete
shaneutt Apr 22, 2025
d277cab
docs: language cleanup and updates for DEVELOPMENT.md
shaneutt Apr 22, 2025
f9afbfd
implementing vuln scanner and h100 cluster deployment
clubanderson Apr 22, 2025
ff21097
merge dev to lora-scorer branch
mayabar Apr 22, 2025
ed6c239
Add active lora scorer + remove loraAffinityThreshold filter
mayabar Apr 22, 2025
bfb1d76
fix lint problems
mayabar Apr 22, 2025
38 changes: 26 additions & 12 deletions .tekton/README.md
@@ -1,6 +1,10 @@
## 🛠️ CI/CD Pipeline Overview – Your Project

This pipeline is designed to support safe, efficient, and traceable development and deployment workflows using OpenShift Pipelines-as-Code, GitHub, and Quay.io.
<!-- NOTE TO CONTRIBUTORS: every repo in the hc4ai organization is intended to have the same contents in this file. The origin is the copy in https://github.ibm.com/mspreitz/hc4ai-hello-neural/blob/dev/.tekton/README.md; submit PRs against that one -->

This pipeline is designed to support safe, efficient, and traceable development and deployment workflows using [OpenShift Pipelines-as-Code](https://pipelinesascode.com/), [Tekton](https://tekton.dev/), [buildah](https://buildah.io/), GitHub, and Quay.io.

This pipeline is used for CI/CD of the `dev` and `main` branches. This pipeline runs from source through container image build to deployment and testing in the hc4ai cluster.

---

@@ -24,19 +28,28 @@ Each repo includes a `.version.json` file at its root. This file controls:

#### 🔑 Fields:
- **dev-version**: Current version of the dev branch. Used to tag dev images.
- **dev-registry**: Container registry location for development image pushes.
- **dev-registry**: Container repository location for development image pushes.
- **prod-version**: Managed by automation. Updated during promotion to match the dev-version.
- **prod-registry**: Container registry for production image pushes. The promoted dev image is re-tagged and pushed here.
- **prod-registry**: Container repository for production image pushes. The promoted dev image is re-tagged and pushed here.

The pipeline reads this file to:
- Extract the appropriate version tag
- Determine the correct registry for image pushes
- Determine the correct repository for image pushes
- Promote and tag dev images for prod

---

### Container Repositories

This pipeline maintains two container repositories for this GitHub repository, as follows.

- `quay.io/vllm-d/<repoName>-dev`. Holds builds from the `dev` branch, as described below.
- `quay.io/vllm-d/<repoName>`. Holds promotions to prod, as described below.
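The naming convention for the two repositories can be sketched as a small shell helper. This is an illustration only: the function name `repo_for_branch` is hypothetical, and treating every branch other than `main` as a dev build is an assumption of the sketch, not something the pipeline definition confirms.

```shell
#!/bin/bash
# Hypothetical helper: map a GitHub repo name and a branch to the container
# repository named in the README. Assumption: any branch other than "main"
# is treated as a dev build.
repo_for_branch() {
  local repo_name="$1" branch="$2"
  if [ "$branch" = "main" ]; then
    printf '%s\n' "quay.io/vllm-d/${repo_name}"
  else
    printf '%s\n' "quay.io/vllm-d/${repo_name}-dev"
  fi
}

# Example calls with a placeholder repo name.
repo_for_branch my-repo dev
repo_for_branch my-repo main
```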

---

### ⚙️ Pipeline Triggers
Triggered on `push` and `pull_request` events targeting the `dev` or `main` branches.
Triggered on `push` and `pull_request` events targeting the `dev` or `main` branches. The pipeline behaves differently on each branch, as the two workflows below describe.

### 🔧 dev Branch Workflow
1. Checkout repository
@@ -47,20 +60,20 @@ Triggered on `push` and `pull_request` events targeting the `dev` or `main` bran
- prod-version
- prod-registry
4. Build and push container image to:
→ `<dev-registry>:<dev-version>`
→ `<dev-repository>:<dev-version>`
5. Tag the Git commit using the `dev-version`
6. Optionally redeploy objects to OpenShift in `hc4ai-operator-dev`
6. Optionally redeploy objects to OpenShift in the `hc4ai-operator-dev` namespace.

✅ This process ensures that all code merged into dev is validated and deployed for testing.

### 🚀 main Branch Workflow
1. Checkout, lint, test, and parse `.version.json`
2. Skip image rebuild
3. Promote image by copying from:
→ `<dev-registry:<dev-version>` → `<prod-registry>:<prod-version>`
→ `<dev-repository>:<dev-version>` → `<prod-repository>:<prod-version>`
4. Tag the Git commit using the `prod-version`
5. Update the upstream repo’s submodule to reference the new tag
6. Redeploy to OpenShift in `hc4ai-operator`
6. Redeploy to OpenShift in the `hc4ai-operator` namespace.

✅ No image rebuilds occur on main. Only validated dev images are promoted, ensuring reproducibility.
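The promotion in step 3 amounts to copying one tagged image reference to another. The sketch below only prints the command it would run. Using `skopeo copy` as the copy tool is an assumption (the README does not say which tool the pipeline uses), and `promote_cmd` and the sample repository names are hypothetical.

```shell
#!/bin/bash
# Hypothetical dry run of the promotion step: builds the source and
# destination image references and prints a copy command without running it.
# Assumption: skopeo is the copy tool; the real pipeline may use another.
promote_cmd() {
  local dev_repo="$1" dev_version="$2" prod_repo="$3" prod_version="$4"
  printf 'skopeo copy docker://%s:%s docker://%s:%s\n' \
    "$dev_repo" "$dev_version" "$prod_repo" "$prod_version"
}

# prod-version is expected to match dev-version after promotion.
promote_cmd quay.io/vllm-d/my-repo-dev 0.0.3 quay.io/vllm-d/my-repo 0.0.3
```

Because nothing is rebuilt, the image that reaches prod is the same image that was validated on `dev`.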

@@ -84,8 +97,8 @@ Tags are created using the configured Git credentials and pushed to the remote r

### ☸️ OpenShift Deployment
The pipeline includes automated deployment:
- On `dev`: Deploys to `hc4ai-operator-dev`
- On `main`: Deploys to `hc4ai-operator`
- On `dev`: Deploys to the `hc4ai-operator-dev` namespace. The Pod is named `<repoName>-major-minor`, using the `dev-version` from `.version.json`.
- On `main`: Deploys to the `hc4ai-operator` namespace. The Pod is named `<repoName>-major-minor`, using the `prod-version` from `.version.json`.
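One possible reading of the `<repoName>-major-minor` naming is sketched below. It assumes the version has the form `major.minor.patch` and that the major and minor numbers are joined with hyphens; both are assumptions, and `pod_name` is a hypothetical helper, not part of the pipeline.

```shell
#!/bin/bash
# Hypothetical helper: derive "<repoName>-<major>-<minor>" from a version
# string. Assumption: the version looks like "major.minor.patch".
pod_name() {
  local repo_name="$1" version="$2"
  local major rest minor
  major="${version%%.*}"   # text before the first dot
  rest="${version#*.}"     # text after the first dot
  minor="${rest%%.*}"      # text before the next dot
  printf '%s-%s-%s\n' "$repo_name" "$major" "$minor"
}

pod_name my-repo 0.0.3
pod_name my-repo 1.2.3
```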

Using `make uninstall-openshift` and `make install-openshift`, resources are cleanly reset.

@@ -112,6 +125,7 @@ After deployment, the pipeline:

### 🧠 Why `.version.json` Matters
- Decouples versioning from Git commit hashes
- Provides a single source of truth for version and registry info
- Provides a single source of truth for version and repository info
- Enables deterministic builds and controlled releases
- Simplifies debugging and auditing across environments

43 changes: 43 additions & 0 deletions .tekton/benchmark.yaml
@@ -0,0 +1,43 @@
apiVersion: tekton.dev/v1
kind: Task
metadata:
  name: benchmark-task
spec:
  params:
    - name: openshift_host
      description: "The OpenShift API server URL"
      type: string
    - name: openshift_namespace
      description: "The OpenShift namespace to use"
      type: string
  steps:
    - name: clone-and-install-fmperf
      image: continuumio/miniconda3:latest
      script: |
        #!/bin/bash
        set -ex

        # Initialize conda (this sets up the environment for conda commands)
        source /opt/conda/etc/profile.d/conda.sh

        echo "Cloning fmperf repository..."
        git clone https://github.com/fmperf-project/fmperf.git
        cd fmperf

        echo "Creating conda environment 'fmperf-env' with Python 3.11..."
        conda create -y -n fmperf-env python=3.11

        echo "Activating the conda environment..."
        conda activate fmperf-env

        echo "Installing required dependencies..."
        pip install -r requirements.txt
        pip install -e .

        echo "Setting up environment variables for OpenShift connection..."
        export OPENSHIFT_HOST="$(params.openshift_host)"
        export OPENSHIFT_TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
        export OPENSHIFT_NAMESPACE="$(params.openshift_namespace)"

        echo "Running fmperf benchmark..."
        python examples/example_vllm.py || true
4 changes: 4 additions & 0 deletions .tekton/buildah-build.yaml
@@ -8,6 +8,9 @@ spec:
      description: "Application version"
    - name: image_tag_base
      description: "Image tag base"
  results:
    - name: image-url
      description: "The full image URL including tag"
  workspaces:
    - name: source
    - name: registry
@@ -65,3 +68,4 @@ spec:
        echo "🚀 Calling make buildah-build with IMG=$IMG..."
        make buildah-build IMG=$IMG

        echo "$IMG" > /tekton/results/image-url
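The new `image-url` result is emitted by writing a file under `/tekton/results`. As far as I understand Tekton's behavior, the file content becomes the result value verbatim, so the trailing newline that `echo` appends may end up in the value. A minimal sketch of a newline-free write follows. `write_result` is a hypothetical helper, and the temporary directory stands in for `/tekton/results` so the snippet can run outside a Task.

```shell
#!/bin/bash
# Hypothetical helper: write a result value without a trailing newline.
# printf '%s' emits exactly the bytes of the value, unlike plain echo.
write_result() {
  local results_dir="$1" name="$2" value="$3"
  printf '%s' "$value" > "${results_dir}/${name}"
}

# Stand-in for /tekton/results so the sketch runs anywhere.
results_dir="$(mktemp -d)"
write_result "$results_dir" image-url "quay.io/vllm-d/my-repo-dev:0.0.3"

# The byte count equals the length of the value, so no newline was added.
wc -c < "${results_dir}/image-url"
```

Whether the consuming task is sensitive to the newline depends on how it uses the value. Image references passed to tools such as `make` or `kubectl` are the usual place where a stray newline causes trouble.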
88 changes: 88 additions & 0 deletions .tekton/deploy-infra-to-openshift.yaml
@@ -0,0 +1,88 @@
apiVersion: tekton.dev/v1
kind: Task
metadata:
  name: openshift-redeploy-infra-task
spec:
  params:
    - name: source-branch
      type: string
      description: "Git branch name"
    - name: prod-version
      type: string
    - name: dev-version
      type: string
    - name: prod_image_tag_base
      type: string
    - name: dev_image_tag_base
      type: string
  workspaces:
    - name: source
  steps:
    - name: redeploy
      image: quay.io/projectquay/golang:1.24
      imagePullPolicy: IfNotPresent
      securityContext:
        privileged: true
      workingDir: $(workspaces.source.path)
      env:
        - name: STORAGE_DRIVER
          value: vfs
      script: |
        #!/bin/bash
        set -e

        echo "📦 Installing dependencies with dnf..."
        dnf install -y make jq curl gettext && dnf clean all

        echo "📥 Installing kubectl..."
        curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
        install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

        echo "📥 Installing kustomize..."
        KUSTOMIZE_TAG=$(curl -s https://api.github.com/repos/kubernetes-sigs/kustomize/releases/latest | jq -r '.tag_name')
        KUSTOMIZE_VERSION="${KUSTOMIZE_TAG##*/}" # strips prefix like 'kustomize/' from tag

        curl -LO "https://github.com/kubernetes-sigs/kustomize/releases/download/${KUSTOMIZE_TAG}/kustomize_${KUSTOMIZE_VERSION}_linux_amd64.tar.gz"

        tar -xzf "kustomize_${KUSTOMIZE_VERSION}_linux_amd64.tar.gz" -C /usr/local/bin
        chmod +x /usr/local/bin/kustomize
        kustomize version

        echo "🔧 Getting namespace and project_name from Makefile..."
        DEFAULT_NAMESPACE=$(make -s print-namespace)
        PROJECT_NAME=$(make -s print-project-name)

        if [ "$(params.source-branch)" = "main" ]; then
          NS="${DEFAULT_NAMESPACE}"
          IMAGE_TAG_BASE=$(params.prod_image_tag_base)
          VERSION=$(params.prod-version)
        else
          NS="${DEFAULT_NAMESPACE}-dev"
          IMAGE_TAG_BASE=$(params.dev_image_tag_base)
          VERSION=$(params.dev-version)
        fi

        echo "🔧 Using namespace: $NS"
        echo "🔧 Using project_name: $PROJECT_NAME"
        echo "🔧 Using image_tag_base: $IMAGE_TAG_BASE"
        echo "🔧 Using version: $VERSION"

        # echo "🧹 Uninstalling existing deployment..."
        # # make uninstall-openshift NAMESPACE=$NS PROJECT_NAME=$PROJECT_NAME IMAGE_TAG_BASE=$IMAGE_TAG_BASE VERSION=$VERSION || echo "❗️ Failed to uninstall deployment"

        # (make uninstall || echo "❗️ Failed to uninstall") && (make undeploy IMG=$IMAGE_TAG_BASE:$VERSION || echo "❗️ Failed to uninstall deployment")

        # echo "⏳ Waiting 3 seconds before reinstall..."
        # sleep 3

        echo "🚀 Reinstalling OpenShift deployment..."
        INFRASTRUCTURE_OVERRIDE=true make install-openshift-infrastructure

        echo "⏳ Waiting 20 seconds before verifying resources..."
        sleep 20

        echo "🔍 Checking status of resources in namespace: $NS"
        kubectl get pods -n $NS || echo "❗️ Failed to get pods"
        kubectl get deploy -n $NS || echo "❗️ Failed to get deployments"
        kubectl get svc -n $NS || echo "❗️ Failed to get services"
        kubectl get routes -n $NS || echo "❗️ Failed to get routes"
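The kustomize install step in this task relies on the `${VAR##*/}` parameter expansion, which removes everything up to and including the last `/`. The sketch below isolates that expansion. `strip_tag_prefix` is a hypothetical name and the tag value `kustomize/v5.4.1` is only an illustrative input, not a claim about the latest release.

```shell
#!/bin/bash
# Hypothetical helper isolating the expansion used for KUSTOMIZE_VERSION:
# "##*/" deletes the longest prefix that ends in "/".
strip_tag_prefix() {
  local tag="$1"
  printf '%s\n' "${tag##*/}"
}

# A tag with a path-like prefix loses the prefix.
strip_tag_prefix "kustomize/v5.4.1"

# A tag with no "/" is returned unchanged.
strip_tag_prefix "v5.4.1"

# Illustrative asset name built the same way the task builds it.
version="$(strip_tag_prefix "kustomize/v5.4.1")"
printf 'kustomize_%s_linux_amd64.tar.gz\n' "$version"
```

The full tag is still needed for the download URL path, which is why the task keeps both `KUSTOMIZE_TAG` and `KUSTOMIZE_VERSION`.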