Commit 31deb1b

adriangb authored (with hyperlint-ai[bot] and Kludex as co-authors)
PYD-1550: add section on OTEL Collector (#877)
Co-authored-by: hyperlint-ai[bot] <154288675+hyperlint-ai[bot]@users.noreply.github.com>
Co-authored-by: Marcelo Trylesinski <[email protected]>
1 parent a45d51a commit 31deb1b

2 files changed: +284 −0 lines

docs/how-to-guides/otel-collector.md

Lines changed: 283 additions & 0 deletions

@@ -0,0 +1,283 @@
# OpenTelemetry Collector

The OpenTelemetry Collector is a powerful tool that can be used to collect, process, and export telemetry data from various sources.
It is designed to work with a wide range of data sources and can be easily configured to meet your specific needs.
It can be run in a multitude of topologies, including as a standalone service, as a sidecar in a container, or as an agent on a host.

Although it is very powerful and versatile, the Collector is an advanced tool that is not required to use Logfire.
If you don't need any of the Collector's features, it is perfectly reasonable to send data from the Logfire SDK directly to our backend; this is the default configuration for our SDK.
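If you do route telemetry through a Collector, an OpenTelemetry SDK typically only needs its standard OTLP exporter environment variables changed. A minimal sketch, assuming a Collector listening locally on the default OTLP/HTTP port (the hostname and port here are illustrative, not Logfire-specific settings):

```shell
# Point the OTLP exporter at a local Collector instead of a backend.
# 4318 is the conventional OTLP/HTTP port; localhost is a placeholder.
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
```

The Collector then holds the Logfire credentials and forwards data onward, so applications never need the write token.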

Use cases for the OpenTelemetry Collector include:

- **Centralized configuration**: keep Logfire credentials in a single place. Configure exporting to multiple backends (e.g. Logfire and audit logging) in a single place. All with the ability to update the configuration without needing to make changes to applications.
- **Data transformation**: transform data before sending it to Logfire. For example, you can use the OpenTelemetry Collector to filter out sensitive information, extract structured data from logs, or otherwise modify the data before sending it to Logfire.
- **Data enrichment**: add additional context to your logs before sending them to Logfire. For example, you can use the OpenTelemetry Collector to add information about the host or container where the log was generated.
- **Collecting existing data sources**: the Collector can be used to collect system logs (e.g. Kubernetes logs) or metrics from other formats. For example, you can use it to collect container logs from Kubernetes and scrape Prometheus metrics.
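As an illustration of the data-transformation use case, here is a hedged sketch of a Collector `processors` snippet that drops a sensitive attribute before export. The attribute key `user.email` is a hypothetical example; the `attributes` processor itself ships in the Collector contrib distribution.

```yaml
processors:
  attributes/scrub:
    actions:
      # Hypothetical sensitive attribute to remove before export
      - key: user.email
        action: delete
```

A processor like this would also need to be referenced in the relevant pipeline under `service.pipelines` to take effect.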

Because Logfire is a fully compliant OpenTelemetry SDK and backend, no special configuration is required to use it with the OpenTelemetry Collector.
Below we include a couple of examples of using the OpenTelemetry Collector, assuming the deployment is done on Kubernetes. You can deploy the Collector on any system; see the [official documentation](https://opentelemetry.io/docs/collector/deployment/) for more information.

This documentation does not attempt to be a complete guide to the OpenTelemetry Collector, but rather a gentle introduction along with some key examples.
For more information on the Collector, please see the [official documentation](https://opentelemetry.io/docs/collector/).
## Collecting system logs

This example shows how you can use the OpenTelemetry Collector to collect system logs (logs on stdout/stderr) from Kubernetes and send them to Logfire.
This may be useful as part of a migration to Logfire if you aren't able to immediately edit all of your applications to install the Logfire SDK, although the data you receive won't be as rich as it would be from tracing with the Logfire SDK.

This relatively simple example is enough in many cases to replace existing systems like Elasticsearch, Loki, or Splunk.
To follow this guide you'll need a local Kubernetes cluster running.
There are many options for this, including [Docker Desktop](https://www.docker.com/blog/how-to-set-up-a-kubernetes-cluster-on-docker-desktop/), [Rancher Desktop](https://docs.rancherdesktop.io/), [Minikube](https://minikube.sigs.k8s.io/docs/start/?arch=%2Fmacos%2Farm64%2Fstable%2Fbinary+download), [Kind](https://kind.sigs.k8s.io/), and [k3s](https://docs.k3s.io/quick-start).

We'll first create an application via `apps.yaml` that emits some structured and unstructured logs to stdout/stderr:
```yaml title="apps.yaml"
apiVersion: apps/v1
kind: Deployment
metadata:
  name: plain-app
  namespace: default
  labels:
    app: plain-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: plain-app
  template:
    metadata:
      labels:
        app: plain-app
    spec:
      terminationGracePeriodSeconds: 1
      containers:
        - name: plain-app
          image: busybox
          command: ["sh", "-c", "while true; do echo 'Hello World'; sleep 1; done"]
          resources:
            limits:
              memory: "64Mi"
              cpu: "500m"
            requests:
              memory: "64Mi"
              cpu: "500m"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: json-app
  namespace: default
  labels:
    app: json-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: json-app
  template:
    metadata:
      labels:
        app: json-app
    spec:
      terminationGracePeriodSeconds: 1
      containers:
        - name: json-app
          image: busybox
          command:
            - "sh"
            - "-c"
            - |
              while true; do
                now=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
                echo "{\"message\":\"Hello world!\",\"level\":\"warn\",\"timestamp\":\"$now\"}"
                sleep 1
              done
          resources:
            limits:
              memory: "64Mi"
              cpu: "500m"
            requests:
              memory: "64Mi"
              cpu: "500m"
```
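Once the shell escaping in the `json-app` command is resolved, each emitted line is plain JSON. A quick sanity check in Python (the timestamp value below is an arbitrary example, not output captured from the cluster):

```python
import json

# One line as json-app would emit it (timestamp is an example value)
line = '{"message":"Hello world!","level":"warn","timestamp":"2025-01-01T12:00:00Z"}'

record = json.loads(line)
print(record["level"])    # -> warn
print(record["message"])  # -> Hello world!
```

This structure is what will later let the Collector promote `level` and `timestamp` to proper log severity and timestamp fields.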

Deploy this application via `kubectl apply -f apps.yaml`.

Now we'll set up a Collector that scrapes logs from these apps, processes them, and sends them to Logfire.

We'll need to store Logfire credentials somewhere. A Kubernetes Secret is a reasonable choice; a better choice for a production environment would be the [External Secrets Operator](https://external-secrets.io/latest/).
First, create a Logfire write token; see [Create Write Tokens](./create-write-tokens.md).

To save it as a Secret in Kubernetes, run the following command, replacing `your-write-token` with the value of the write token you just created:
```shell
kubectl create secret generic logfire-token --from-literal=logfire-token=your-write-token
```

Note that this is equivalent to the following `secrets.yaml` file, but using `kubectl` is easier because it will base64-encode the secret for you.
```yaml title="secrets.yaml"
apiVersion: v1
kind: Secret
metadata:
  name: logfire-token
type: Opaque
data:
  logfire-token: base64-encoded-logfire-token
```
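If you do write `secrets.yaml` by hand, the value under `data:` must be base64-encoded; this is what `kubectl create secret` does for you. For illustration, encoding the placeholder token from above:

```shell
# base64-encode the placeholder write token by hand
echo -n 'your-write-token' | base64
# -> eW91ci13cml0ZS10b2tlbg==
```

The `-n` flag matters: without it, `echo` appends a newline that would become part of the stored secret.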

For the OTel Collector to scrape logs, it needs permissions on the Kubernetes API that Kubernetes does not grant by default (you wouldn't want arbitrary pods to be able to read logs from other pods!).

To grant these permissions we'll create an `rbac.yaml` file with the following content:
```yaml title="rbac.yaml"
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-role
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-collector-role
subjects:
  - kind: ServiceAccount
    name: otel-collector
    namespace: default
```

Apply this configuration via `kubectl apply -f rbac.yaml`.

Now we can create the deployment for the Collector itself.
There are [several options for deploying the OTel Collector](https://opentelemetry.io/docs/platforms/kubernetes/collector/components) including:

- As a sidecar container on each / some pods. This requires fewer permissions but implies manually configuring each deployment with a sidecar. This option may work well if you want to bolt **Logfire** onto specific existing applications you control, without modifying the applications themselves or deploying the Collector cluster-wide.
- As a DaemonSet, which deploys the Collector on every node in the cluster. This is a good option if you want to collect logs from all pods in the cluster without modifying each deployment. Additionally, DaemonSets can collect certain information that is not available to sidecars or services. This is the option we will use in this guide.
- As a Service/Gateway, which allows you to deploy the Collector as a standalone Kubernetes service.

Create a `collector.yaml` file with the following content:
```yaml title="collector.yaml"
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-config
data:
  config.yaml: |-
    receivers:
      filelog:
        include_file_path: true
        include:
          - /var/log/pods/*/*/*.log
        exclude:
          # Exclude logs from all containers named otel-collector
          - /var/log/pods/*/otel-collector/*.log
        operators:
          - id: container-parser
            type: container
          - id: json_parser
            type: json_parser
            if: 'hasPrefix(body, "{\"")'
            parse_from: body
            parse_to: attributes
            parse_ints: true
            timestamp:
              parse_from: attributes.timestamp
              layout_type: strptime
              layout: "%Y-%m-%dT%H:%M:%S.%f%z"
            severity:
              parse_from: attributes.level
              overwrite_text: true
    exporters:
      debug:
      otlphttp:
        endpoint: "https://logfire-api.pydantic.info"
        headers:
          Authorization: "Bearer ${env:LOGFIRE_TOKEN}"
    service:
      pipelines:
        logs:
          receivers: [filelog]
          exporters: [debug, otlphttp]
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-collector
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-collector
    spec:
      serviceAccountName: otel-collector
      terminationGracePeriodSeconds: 1
      containers:
        - name: otel-collector
          image: otel/opentelemetry-collector-contrib:0.119.0
          env:
            - name: LOGFIRE_TOKEN
              valueFrom:
                secretKeyRef:
                  name: logfire-token
                  key: logfire-token
          resources:
            limits:
              cpu: 100m
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 200Mi
          volumeMounts:
            - mountPath: /var/log
              name: varlog
              readOnly: true
            - mountPath: /var/lib/docker/containers
              name: varlibdockercontainers
              readOnly: true
            - mountPath: /etc/otelcol-contrib/config.yaml
              name: data
              subPath: config.yaml
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: data
          configMap:
            name: otel-collector-config
```
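The `json_parser` operator above only runs when a log body looks like JSON, so the `plain-app` lines pass through untouched while the `json-app` lines are parsed into attributes. A rough Python sketch of that routing logic (an illustration of the behaviour, not the Collector's actual implementation):

```python
import json

def parse_body(body: str) -> dict:
    """Roughly mimic the json_parser operator's routing: JSON-looking
    bodies are parsed into attributes, everything else passes through."""
    if body.startswith('{"'):  # mirrors the `hasPrefix(body, "{\"")` check
        return {"body": body, "attributes": json.loads(body)}
    return {"body": body, "attributes": {}}

structured = parse_body('{"message":"Hello world!","level":"warn"}')
plain = parse_body("Hello World")

print(structured["attributes"]["level"])  # -> warn
print(plain["attributes"])                # -> {}
```

In the real Collector, the `timestamp` and `severity` blocks then lift `attributes.timestamp` and `attributes.level` out of the parsed attributes into the log record's own timestamp and severity fields.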

Apply this configuration via `kubectl apply -f collector.yaml`.

You should now see logs from the `plain-app` and `json-app` in your Logfire dashboard!

mkdocs.yml

Lines changed: 1 addition & 0 deletions

@@ -95,6 +95,7 @@ nav:
     - Trace across Multiple Services: how-to-guides/distributed-tracing.md
     - Detect Service is Down: how-to-guides/detect-service-is-down.md
     - Suppress Spans and Metrics: how-to-guides/suppress.md
+    - OpenTelemetry Collector: how-to-guides/otel-collector.md
   - Integrations:
     - Integrations: integrations/index.md
     - LLMs:
