SOFTWARE-5963: Document Kuantifier #194

Open
wants to merge 2 commits into
base: master
from

Conversation

mwestphall
Collaborator

  • Document installing Helm and using it to install Prometheus, kube-state-metrics, and Kuantifier
  • Document known constraints for Kuantifier accounting
    • single-container pods, workloads run via Jobs rather than Deployments, mandatory CPU request
  • Document required changes to values.yaml for configuring Kuantifier for Gratia output

@mwestphall
Collaborator Author

@brianhlin @matyasselmeci looks like we never got this very old PR reviewed, any comments on it?

Contributor

brianhlin left a comment

Oops, I had a half-finished review here. Mostly LGTM, mostly nitpicks in-line

Comment on lines +8 to +9
To report contributions to OSG made via Kubernetes, the [Kuantifier](kuantifier-github) helm chart can be installed
into your cluster.
Contributor

Suggested change
- To report contributions to OSG made via Kubernetes, the [Kuantifier](kuantifier-github) helm chart can be installed
- into your cluster.
+ To report contributions to OSG made via Kubernetes, install the [Kuantifier](kuantifier-github) helm chart
+ on your cluster.

We should be directive when writing technical docs: you generally want to stay away from "you may" and "you can". Using this sentence as an example, what alternatives to Kuantifier are there for reporting k8s contributions to the OSG? There aren't any, so we shouldn't present this as if the reader had an alternative!

@@ -0,0 +1,191 @@
title: Monitor Kubernetes Workloads with Kuantifier
DateReviewed: 2024-08-16
Contributor

Suggested change
- DateReviewed: 2024-08-16
+ DateReviewed: 2024-02-07


### Install the Helm command line tools

Kuantifier itself, and several of its prerequisites, are installed via [helm chart](https://helm.sh/). The helm
Contributor

Suggested change
- Kuantifier itself, and several of its prerequisites, are installed via [helm chart](https://helm.sh/). The helm
+ Kuantifier itself, and several of its prerequisites, are installed via [Helm chart](https://helm.sh/). The Helm

Unless you're referring to the `helm` CLI, I think it's a proper noun.

command line tools are used to install helm charts against a running kubernetes cluster, and can be installed
as follows:

1. Download the latest [helm release](helm-release)
Contributor

Suggested change
- 1. Download the latest [helm release](helm-release)
+ 1. Download the latest [Helm release](helm-release)

I don't think the `helm-release` link reference is defined anywhere.

1. Ensure that the namespace where your workload pods run is properly configured.

- Kuantifier relies on the `spec.containers[].resources.requests.cpu` field in workload pods
to determine processor count for GRACC reporting. Ensure a cpu request is set for pods in
Contributor

Suggested change
- to determine processor count for GRACC reporting. Ensure a cpu request is set for pods in
+ to determine processor count for GRACC reporting. Ensure a CPU request is set for pods in
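
For readers of this thread, one way to check the CPU request requirement quoted above is to list each pod's request before relying on Kuantifier's numbers. This is a minimal sketch and not part of the PR: `list_cpu_requests` is a hypothetical helper name and the namespace is whatever namespace your workload pods run in. Pods with no request should show `<none>` in the second column.

```shell
# Hypothetical helper (not from the PR): print the CPU request(s) of every pod
# in the given namespace, one row per pod.
list_cpu_requests() {
  ns="$1"   # namespace where the workload pods run (placeholder, supply your own)
  kubectl -n "$ns" get pods \
    -o custom-columns='NAME:.metadata.name,CPU_REQUEST:.spec.containers[*].resources.requests.cpu'
}
```

Calling `list_cpu_requests <workload-namespace>` prints a two-column table; any row without a CPU value points at a pod that the documented constraint says Kuantifier cannot size.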

kubectl -n monitoring get configmap kuantifier-processor-config -o yaml


If the helm chart artifacts are present as expected, run a test instance of the CronJob and inspect its output.
Contributor

Suggested change
- If the helm chart artifacts are present as expected, run a test instance of the CronJob and inspect its output.
+ If the Helm chart artifacts are present as expected, run a test instance of the CronJob and inspect its output.

kubectl -n monitoring create job --from=cronjob/kuantifier-cronjob kuantifier-test-job
kubectl -n monitoring get pod | grep kuantifier-test-job
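
As an alternative to the `grep` above, the test Job's pod can be found by label once the Job has finished. This is a sketch under assumptions: `wait_for_test_job` is a hypothetical helper name, the 300-second timeout is an arbitrary choice, and it relies on the `job-name` label that the Kubernetes Job controller normally sets on the pods it creates.

```shell
# Hypothetical helper (not from the PR): wait for the test Job to complete,
# then print the name(s) of the pod(s) it created.
wait_for_test_job() {
  job="${1:-kuantifier-test-job}"   # Job name used in the documented command
  ns="${2:-monitoring}"             # namespace used in the documented command
  # The timeout is an assumed value; a failed Job never becomes "complete",
  # so this wait simply times out in that case.
  kubectl -n "$ns" wait --for=condition=complete "job/$job" --timeout=300s &&
    kubectl -n "$ns" get pods -l "job-name=$job" -o name
}
```

The printed pod name is what the later `kubectl logs` steps refer to as `<test-job-pod-name>`.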

1. Inspect the logs from the processor initContainer, which queries prometheus to generate output records.
Contributor

Suggested change
- 1. Inspect the logs from the processor initContainer, which queries prometheus to generate output records.
+ 1. Inspect the logs from the processor initContainer, which queries Prometheus to generate output records.

:::console
kubectl -n monitoring logs <test-job-pod-name> -c processor

1. Inspect the logs from the gratia-output container, which sends the output records to GRACC.
Contributor

Suggested change
- 1. Inspect the logs from the gratia-output container, which sends the output records to GRACC.
+ 1. Inspect the logs from the `gratia-output` container, which sends the output records to GRACC.

:::console
kubectl -n monitoring logs <test-job-pod-name> -c gratia-output
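
The two log checks can also be run in one pass. The following is a minimal sketch, assuming the container names `processor` and `gratia-output` quoted in this PR; `show_kuantifier_logs` is a hypothetical helper name, not something the chart provides.

```shell
# Hypothetical helper (not from the PR): print the logs of both containers of
# the test Job's pod, stopping at the first container whose logs cannot be read.
show_kuantifier_logs() {
  pod="$1"                # pod name, i.e. <test-job-pod-name>
  ns="${2:-monitoring}"   # namespace used in the documented commands
  for c in processor gratia-output; do
    echo "== $c =="
    kubectl -n "$ns" logs "$pod" -c "$c" || return 1
  done
}
```

Usage is `show_kuantifier_logs <test-job-pod-name>`; a non-zero return means one of the `kubectl logs` calls failed, which is separate from whether the container itself reported errors in its output.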

If both the processor initContainer and gratia-output container run to completion without error, the next step
Contributor

Suggested change
- If both the processor initContainer and gratia-output container run to completion without error, the next step
+ If both the processor initContainer and `gratia-output` container run to completion without error, the next step

kubectl -n monitoring logs <test-job-pod-name> -c gratia-output

If both the processor initContainer and gratia-output container run to completion without error, the next step
is to confirm with a member of the OSG technology team that the results are visible in GRACC.
Contributor

I think we should just point them at the https://gracc.opensciencegrid.org/d/000000079/site-summary?orgId=1&var-site=Purdue%20Geddes&var-type=Batch&var-interval=$__auto_interval_interval dashboard and let them check this themselves. It may take a few hours for results to show up (Ashton/Derek would know for sure).

That being said, there are lots of places where this documentation can go wrong. Please add our standard Getting Help section.
