SOFTWARE-5963: Document Kuantifier #194
base: master
mwestphall commented Aug 19, 2024
- Document installing Helm, using Helm to install Prometheus, kube-state-metrics, and Kuantifier
- Document known constraints for Kuantifier accounting
    - single-container pods, jobs run via Jobs rather than Deployments, mandatory CPU request
- Document required changes to values.yaml for configuring Kuantifier for Gratia output
@brianhlin @matyasselmeci looks like we never got this very old PR reviewed, any comments on it?
Oops, I had a half-finished review here. Mostly LGTM, mostly nitpicks in-line
To report contributions to OSG made via Kubernetes, the [Kuantifier](kuantifier-github) helm chart can be installed
into your cluster.
- To report contributions to OSG made via Kubernetes, the [Kuantifier](kuantifier-github) helm chart can be installed
- into your cluster.
+ To report contributions to OSG made via Kubernetes, install the [Kuantifier](kuantifier-github) helm chart
+ on your cluster.
We should be directive when writing technical docs: you generally want to stay away from "you may" and "you can". Using this sentence as an example, what alternatives do you have to Kuantifier to report k8s contributions to the OSG? There aren't any so we shouldn't present this as if they have an alternative!
@@ -0,0 +1,191 @@
title: Monitor Kubernetes Workloads with Kuantifier
DateReviewed: 2024-08-16
- DateReviewed: 2024-08-16
+ DateReviewed: 2024-02-07

### Install the Helm command line tools

Kuantifier itself, and several of its prerequisites, are installed via [helm chart](https://helm.sh/). The helm
- Kuantifier itself, and several of its prerequisites, are installed via [helm chart](https://helm.sh/). The helm
+ Kuantifier itself, and several of its prerequisites, are installed via [Helm chart](https://helm.sh/). The Helm
Unless you're referring to the CLI `helm`, I think it's a proper noun
command line tools are used to install helm charts against a running kubernetes cluster, and can be installed
as follows:

1. Download the latest [helm release](helm-release)
- 1. Download the latest [helm release](helm-release)
+ 1. Download the latest [Helm release](helm-release)
I don't think `helm-release` is defined anywhere?
1. Ensure that the namespace where your workload pods run is properly configured.

    - Kuantifier relies on the `spec.containers[].resources.requests.cpu` field in workload pods
      to determine processor count for GRACC reporting. Ensure a cpu request is set for pods in
- to determine processor count for GRACC reporting. Ensure a cpu request is set for pods in
+ to determine processor count for GRACC reporting. Ensure a CPU request is set for pods in
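
A quick way to check this constraint is to list the CPU request recorded on each pod in the workload namespace. The sketch below is a hedged example, not part of the PR or of Kuantifier: the function name `pods_missing_cpu_request` and its namespace argument are hypothetical, and it assumes `kubectl` is already configured for the cluster.

```shell
# Hypothetical helper (not part of Kuantifier): print the names of pods in a
# namespace for which no container CPU request is set.
pods_missing_cpu_request() {
  ns="$1"
  # custom-columns prints "<none>" when the requested field is absent.
  kubectl -n "$ns" get pods --no-headers \
    -o custom-columns='POD:.metadata.name,CPU:.spec.containers[*].resources.requests.cpu' |
    awk '$2 == "<none>" { print $1 }'
}
```

Calling `pods_missing_cpu_request <workload-namespace>` prints nothing when every pod in that namespace carries a CPU request.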
kubectl -n monitoring get configmap kuantifier-processor-config -o yaml

If the helm chart artifacts are present as expected, run a test instance of the CronJob and inspect its output.
- If the helm chart artifacts are present as expected, run a test instance of the CronJob and inspect its output.
+ If the Helm chart artifacts are present as expected, run a test instance of the CronJob and inspect its output.
kubectl -n monitoring create job --from=cronjob/kuantifier-cronjob kuantifier-test-job
kubectl -n monitoring get pod | grep kuantifier-test-job

1. Inspect the logs from the processor initContainer, which queries prometheus to generate output records.
- 1. Inspect the logs from the processor initContainer, which queries prometheus to generate output records.
+ 1. Inspect the logs from the processor initContainer, which queries Prometheus to generate output records.
:::console
kubectl -n monitoring logs <test-job-pod-name> -c processor

1. Inspect the logs from the gratia-output container, which sends the output records to GRACC.
- 1. Inspect the logs from the gratia-output container, which sends the output records to GRACC.
+ 1. Inspect the logs from the `gratia-output` container, which sends the output records to GRACC.
:::console
kubectl -n monitoring logs <test-job-pod-name> -c gratia-output

If both the processor initContainer and gratia-output container run to completion without error, the next step
- If both the processor initContainer and gratia-output container run to completion without error, the next step
+ If both the processor initContainer and `gratia-output` container run to completion without error, the next step
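
To judge whether both containers ran cleanly, it can help to collect their logs in one pass. A minimal sketch, assuming the `monitoring` namespace and the `processor` and `gratia-output` container names used in the commands in this PR; the function name `kuantifier_test_logs` is hypothetical.

```shell
# Hypothetical helper: print the logs of both containers of a test-job pod,
# each preceded by a header naming the container.
kuantifier_test_logs() {
  pod="$1"
  for container in processor gratia-output; do
    echo "=== $container ==="
    kubectl -n monitoring logs "$pod" -c "$container"
  done
}
```

Run it as `kuantifier_test_logs <test-job-pod-name>`, using the pod name found with the `kubectl -n monitoring get pod` command.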
kubectl -n monitoring logs <test-job-pod-name> -c gratia-output

If both the processor initContainer and gratia-output container run to completion without error, the next step
is to confirm with a member of the OSG technology team that the results are visible in GRACC.
I think we should just point them at the https://gracc.opensciencegrid.org/d/000000079/site-summary?orgId=1&var-site=Purdue%20Geddes&var-type=Batch&var-interval=$__auto_interval_interval dashboard and have them self-serve this. It may take a few hours for results to show up (Ashton/Derek would know for sure).
That being said, there are lots of places where this documentation can go wrong. Please add our standard Getting Help section