4 changes: 4 additions & 0 deletions infra/helm/bud/templates/budgateway.yaml
@@ -87,6 +87,10 @@ spec:
ports:
- containerPort: 3000
env:
+- name: NODE_IP
+valueFrom:
+fieldRef:
+fieldPath: status.hostIP
- name: TENSORZERO_REDIS_URL
value: "redis://default:{{ .Values.valkey.auth.password}}@{{ .Release.Name }}-valkey-primary:6379/2"
- name: TENSORZERO_CLICKHOUSE_URL

23 changes: 17 additions & 6 deletions infra/helm/bud/templates/extra/otel-collector.yaml
@@ -88,14 +88,13 @@ data:
address: 0.0.0.0:8888
---
apiVersion: apps/v1
-kind: Deployment
+kind: DaemonSet

Review comment (Contributor, priority: medium):

With the collector now running as a DaemonSet and services configured to send telemetry to the collector on the same node via $(NODE_IP), the ClusterIP service for otel-collector defined later in this file (lines 178-201) appears to be redundant. If this service is no longer used by any component, consider removing it to simplify the configuration.

metadata:
name: {{ .Release.Name }}-otel-collector
labels:
app: {{ .Release.Name }}-otel-collector
component: observability
spec:
-replicas: {{ .Values.otelCollector.replicas | default 1 }}
selector:
matchLabels:
app: {{ .Release.Name }}-otel-collector
@@ -116,6 +115,16 @@ spec:
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
+tolerations:
+- key: node-role.kubernetes.io/control-plane
+operator: Exists
+effect: NoSchedule
+- key: node-role.kubernetes.io/master
+operator: Exists
+effect: NoSchedule
+{{- with .Values.otelCollector.tolerations }}
+{{- toYaml . | nindent 6 }}
+{{- end }}
containers:
- name: otel-collector
image: {{ .Values.otelCollector.image.repository }}:{{ .Values.otelCollector.image.tag }}
@@ -126,9 +135,11 @@ spec:
ports:
- name: otlp-grpc
containerPort: 4317
+hostPort: 4317
protocol: TCP
- name: otlp-http
containerPort: 4318
+hostPort: 4318

Review comment on lines 137 to +142 (Contributor, security-high, priority: high):

Using hostPort exposes the OpenTelemetry collector ports on the node's IP address. If the nodes have publicly routable IP addresses, this could expose the collector to the internet, which can be a security risk. It is highly recommended to use NetworkPolicies to restrict access to these ports, allowing traffic only from within the cluster or trusted sources.
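
A minimal sketch of what such a NetworkPolicy could look like, offered as an illustration rather than as part of this PR. The policy name is hypothetical; the pod label and the ports (4317, 4318, 8888) are taken from the template in this diff. Two caveats apply: the cluster's CNI plugin must enforce NetworkPolicy at all, and whether traffic arriving through a hostPort is matched by pod or namespace peers depends on the CNI implementation, so an `ipBlock` covering the node or pod CIDR may be needed instead.

```yaml
# Hypothetical policy: admit only in-cluster traffic to the collector pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: {{ .Release.Name }}-otel-collector-ingress  # assumed name, not in the PR
spec:
  podSelector:
    matchLabels:
      app: {{ .Release.Name }}-otel-collector
  policyTypes:
    - Ingress
  ingress:
    - from:
        # An empty namespaceSelector matches pods in every namespace.
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 4317   # otlp-grpc
        - protocol: TCP
          port: 4318   # otlp-http
        - protocol: TCP
          port: 8888   # collector metrics
```

Because the policy selects the collector pods for Ingress, any port not listed here becomes unreachable from other pods, so the list should be checked against everything that actually scrapes or calls the collector.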

protocol: TCP
- name: metrics
containerPort: 8888
@@ -141,11 +152,11 @@ spec:
value: "80"
resources:
limits:
-memory: {{ .Values.otelCollector.resources.limits.memory | default "2Gi" }}
-cpu: {{ .Values.otelCollector.resources.limits.cpu | default "1000m" }}
+memory: {{ .Values.otelCollector.resources.limits.memory | default "512Mi" }}
+cpu: {{ .Values.otelCollector.resources.limits.cpu | default "500m" }}
requests:
-memory: {{ .Values.otelCollector.resources.requests.memory | default "512Mi" }}
-cpu: {{ .Values.otelCollector.resources.requests.cpu | default "100m" }}
+memory: {{ .Values.otelCollector.resources.requests.memory | default "128Mi" }}
+cpu: {{ .Values.otelCollector.resources.requests.cpu | default "50m" }}
livenessProbe:
httpGet:
path: /

4 changes: 4 additions & 0 deletions infra/helm/bud/templates/microservices/budcluster.yaml
@@ -44,6 +44,10 @@ spec:
ports:
- containerPort: 3003
env:
+- name: NODE_IP
+valueFrom:
+fieldRef:
+fieldPath: status.hostIP
- name: BUD_SERVE_URL
value: {{ include "bud.ingress.url.budapp" $ }}
- name: NOTIFY_SERVICE_NAME

4 changes: 4 additions & 0 deletions infra/helm/bud/templates/microservices/budmetrics.yaml
@@ -43,6 +43,10 @@ spec:
ports:
- containerPort: 3005
env:
+- name: NODE_IP
+valueFrom:
+fieldRef:
+fieldPath: status.hostIP

Review comment on lines +46 to +49 (Contributor, priority: medium):

The NODE_IP environment variable is being injected into the budmetrics container, but it doesn't appear to be used. The values.yaml file does not define an OTEL_..._ENDPOINT variable for the budmetrics service that would consume this NODE_IP. Please verify if budmetrics is intended to send telemetry to the OpenTelemetry collector and update values.yaml accordingly. If it's not needed, this environment variable injection should be removed.

- name: PSQL_HOST
value: {{ include "bud.externalServices.postgresql.host" . }}
- name: PSQL_PORT

4 changes: 4 additions & 0 deletions infra/helm/bud/templates/microservices/budprompt.yaml
@@ -42,6 +42,10 @@ spec:
ports:
- containerPort: 3015
env:
+- name: NODE_IP
+valueFrom:
+fieldRef:
+fieldPath: status.hostIP
# - name: BUD_GATEWAY_BASE_URL
# value: {{ include "bud.ingress.url.budgateway" $ }}
- name: PSQL_HOST

10 changes: 6 additions & 4 deletions infra/helm/bud/values.yaml
@@ -160,7 +160,8 @@ microservices:
enabled: true
image: budstudio/budgateway:0.4.8
env:
-OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "http://{{ $.Release.Name }}-otel-collector:4317"
+# Uses NODE_IP injected via Kubernetes Downward API in template
+OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "http://$(NODE_IP):4317"
nodeSelector: {}
affinity: {}
# Autoscaling configuration for budgateway
@@ -252,8 +253,8 @@ microservices:
METRICS_COLLECTION_ENABLED: "true"
METRICS_COLLECTION_TIMEOUT: "30"
METRICS_BATCH_SIZE: "20000"
-# OpenTelemetry Collector configuration
-OTEL_COLLECTOR_ENDPOINT: "http://{{ $.Release.Name }}-otel-collector:4318"
+# OpenTelemetry Collector configuration - Uses NODE_IP injected via Kubernetes Downward API
+OTEL_COLLECTOR_ENDPOINT: "http://$(NODE_IP):4318"
# Prometheus configuration
PROMETHEUS_SERVICE_NAME: "bud-metrics-kube-prometheu-prometheus"
PROMETHEUS_NAMESPACE: "bud-system"
@@ -333,7 +334,8 @@ microservices:
REDIS_DB_INDEX: "10"
REDIS_PASSWORD: "{{ .Values.valkey.auth.password }}"
OTEL_SDK_DISABLED: "false"
-OTEL_EXPORTER_OTLP_ENDPOINT: "http://{{ $.Release.Name }}-otel-collector:4318"
+# Uses NODE_IP injected via Kubernetes Downward API in template
+OTEL_EXPORTER_OTLP_ENDPOINT: "http://$(NODE_IP):4318"
nodeSelector: {}
affinity: {}
global:
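
The `$(NODE_IP)` references in values.yaml depend on Kubernetes dependent environment variable expansion. A `$(VAR)` reference is resolved only against variables defined earlier in the same container's `env` list, and an unresolved reference is left in the value as a literal string. A minimal sketch of the rendered container `env` these changes appear to aim for, assuming the chart renders the values.yaml `env` entries after the `NODE_IP` entry (the rendering loop is not shown in this diff, so the order is an assumption):

```yaml
env:
  # Defined first so later entries can reference it.
  - name: NODE_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  # Expanded by Kubernetes at container start to http://<node IP>:4318.
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: "http://$(NODE_IP):4318"
```

If the order were reversed, the exporter would receive the unexpanded string `http://$(NODE_IP):4318`, so the position of the `NODE_IP` entry in each template is worth checking when reviewing.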