[chore][otel] Enable persistence in otel.yml (#5549)
* chore: update config, docs

* fix: spelling mistake

* fix: go tidy

* remove comments

* fix: go tidy

* fix: add filestorage to extension

* Update changelog/fragments/1726572104-enable-persistence-by-default.yaml

Co-authored-by: Andrzej Stencel

* chore: update readme

* chore: update readme

* go.mod

* chore: go.mod

* comments

* comments

* chore: update notice

* chore: go.sum and notice

* chore: go.sum and notice

* Update internal/pkg/otel/templates/README.md.tmpl

Co-authored-by: Craig MacKenzie

* Update internal/pkg/otel/templates/README.md.tmpl

Co-authored-by: Craig MacKenzie

* chore: address comments

* chore: use STATE_PATH

* chore: update readme

* fix: add missing import

---------

Co-authored-by: Andrzej Stencel
Co-authored-by: Craig MacKenzie
3 people authored Oct 21, 2024
1 parent 283429a commit 49579ae
Showing 5 changed files with 154 additions and 1 deletion.
32 changes: 32 additions & 0 deletions changelog/fragments/1726572104-enable-persistence-by-default.yaml
@@ -0,0 +1,32 @@
# Kind can be one of:
# - breaking-change: a change to previously-documented behavior
# - deprecation: functionality that is being removed in a later release
# - bug-fix: fixes a problem in a previous version
# - enhancement: extends functionality but does not break or fix existing behavior
# - feature: new functionality
# - known-issue: problems that we are aware of in a given version
# - security: impacts on the security of a product or a user’s deployment.
# - upgrade: important information for someone upgrading from a prior version
# - other: does not fit into any of the other categories
kind: feature

# Change summary; an 80ish characters long description of the change.
summary: Enable persistence in the configuration provided with our OTel Collector distribution.

# Long description; in case the summary is not enough to describe the change
# this field accommodates a description without length limits.
# NOTE: This field will be rendered only for breaking-change and known-issue kinds at the moment.
#description:

# Affected component; usually one of "elastic-agent", "fleet-server", "filebeat", "metricbeat", "auditbeat", "all", etc.
component: elastic-agent,otel

# PR URL; optional; the PR number that added the changeset.
# If not present is automatically filled by the tooling finding the PR where this changelog fragment has been added.
# NOTE: the tooling supports backports, so it's able to fill the original PR number instead of the backport PR number.
# Please provide it if you are adding a fragment for a different PR.
pr:

# Issue URL; optional; the GitHub issue related to this changeset (either closes or is part of).
# If not present is automatically filled by the tooling with the issue linked to the PR number.
#issue: https://github.com/owner/repo/1234
19 changes: 19 additions & 0 deletions internal/pkg/agent/cmd/otel.go
@@ -8,6 +8,7 @@ package cmd

import (
"context"
"os"

"github.com/spf13/cobra"
"github.com/spf13/pflag"
@@ -27,6 +28,9 @@ func newOtelCommandWithArgs(args []string, streams *cli.IOStreams) *cobra.Comman
if err != nil {
return err
}
if err := prepareEnv(); err != nil {
return err
}
return runCollector(cmd.Context(), cfgFiles)
},
PreRun: func(c *cobra.Command, args []string) {
@@ -78,3 +82,18 @@ func runCollector(cmdCtx context.Context, configFiles []string) error {

return otel.Run(ctx, stop, configFiles)
}

func prepareEnv() error {
	if _, ok := os.LookupEnv("STATE_PATH"); !ok {
		// STATE_PATH is not set. Set it to defaultStateDirectory, because we do not want to use
		// any of the paths that are also used by Beats or Agent: a standalone OTel collector
		// must be able to run alongside them without issue.

		// The filestorage extension handles directory creation, since create_directory: true is set in the provided configuration.
		// If the user's filestorage config does not reference env:STATE_PATH, they have opted for a custom path,
		// and the extension will create that directory instead. In this case, setting STATE_PATH has no effect.
		if err := os.Setenv("STATE_PATH", defaultStateDirectory); err != nil {
			return err
		}
	}
	return nil
}
48 changes: 48 additions & 0 deletions internal/pkg/otel/README.md
@@ -81,3 +81,51 @@ This section provides a summary of components included in the Elastic Distributi
|---|---|
| [signaltometricsconnector](https://github.com/elastic/opentelemetry-collector-components/blob/connector/signaltometricsconnector/v0.2.1/connector/signaltometricsconnector/README.md) | v0.2.1 |
| [spanmetricsconnector](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/connector/spanmetricsconnector/v0.111.0/connector/spanmetricsconnector/README.md) | v0.111.0 |
## Persistence in OpenTelemetry Collector

By default, the OpenTelemetry Collector is stateless, which means it doesn't store offsets on disk while reading files. As a result, if you restart the collector, it won't retain the last read offset, potentially leading to data duplication or loss. However, we have configured persistence in the settings provided with the Elastic Agent package.

To enable persistence for the `filelogreceiver`, we add the `file_storage` extension and activate it for `filelog`.
Execute `export STATE_PATH=/path/to/store/otel/offsets` and use the following configuration to enable persistence:

```yaml
receivers:
  filelog/platformlogs:
    include: [ /var/log/system.log ]
    start_at: beginning
    storage: file_storage/filelogreceiver
extensions:
  file_storage/filelogreceiver:
    directory: ${env:STATE_PATH}
    create_directory: true
exporters:
  ...
processors:
  ...
service:
  extensions: [file_storage/filelogreceiver]
  pipelines:
    logs/platformlogs:
      receivers: [filelog/platformlogs]
      processors: [...]
      exporters: [...]
```
> [!WARNING]
> Removing the `storage` key from the `filelog` section disables persistence, which leads to data duplication or loss when the collector restarts.

> [!IMPORTANT]
> If you remove the `create_directory: true` option, you must create the storage directory manually. You can omit this option if the directory already exists.

### Persistence in standalone Docker mode

When running the Elastic Distribution for OpenTelemetry Collector in Docker, checkpoints are stored in `/usr/share/elastic-agent/otel_registry` by default. To ensure data persists across container restarts, bind-mount a host directory at that path:

```bash
docker run --rm -ti --entrypoint="elastic-agent" --mount type=bind,source=/path/on/host,target=/usr/share/elastic-agent/otel_registry docker.elastic.co/beats/elastic-agent:9.0.0-SNAPSHOT otel
```

### Known issues:
- When running in otel mode, you see the error `failed to build extensions: failed to create extension "file_storage/filelogreceiver": mkdir ...: permission denied`.
  - Cause: The user running the executable likely lacks permission to create the directory.
  - Resolution: Create the directory manually, or specify a path where that user has the necessary permissions.
50 changes: 50 additions & 0 deletions internal/pkg/otel/templates/README.md.tmpl
@@ -74,3 +74,53 @@ This section provides a summary of components included in the Elastic Distributi
| [{{ .Name }}]({{ .Link }}) | {{ .Version }} |
{{ end -}}
{{ end -}}


## Persistence in OpenTelemetry Collector

By default, the OpenTelemetry Collector is stateless, which means it doesn't store offsets on disk while reading files. As a result, if you restart the collector, it won't retain the last read offset, potentially leading to data duplication or loss. However, we have configured persistence in the settings provided with the Elastic Agent package.

To enable persistence for the `filelogreceiver`, we add the `file_storage` extension and activate it for `filelog`.
Execute `export STATE_PATH=/path/to/store/otel/offsets` and use the following configuration to enable persistence:

```yaml
receivers:
  filelog/platformlogs:
    include: [ /var/log/system.log ]
    start_at: beginning
    storage: file_storage/filelogreceiver
extensions:
  file_storage/filelogreceiver:
    directory: ${env:STATE_PATH}
    create_directory: true
exporters:
  ...
processors:
  ...
service:
  extensions: [file_storage/filelogreceiver]
  pipelines:
    logs/platformlogs:
      receivers: [filelog/platformlogs]
      processors: [...]
      exporters: [...]
```

> [!WARNING]
> Removing the `storage` key from the `filelog` section disables persistence, which leads to data duplication or loss when the collector restarts.

> [!IMPORTANT]
> If you remove the `create_directory: true` option, you must create the storage directory manually. You can omit this option if the directory already exists.

### Persistence in standalone Docker mode

When running the Elastic Distribution for OpenTelemetry Collector in Docker, checkpoints are stored in `/usr/share/elastic-agent/otel_registry` by default. To ensure data persists across container restarts, bind-mount a host directory at that path:

```bash
docker run --rm -ti --entrypoint="elastic-agent" --mount type=bind,source=/path/on/host,target=/usr/share/elastic-agent/otel_registry docker.elastic.co/beats/elastic-agent:9.0.0-SNAPSHOT otel
```

### Known issues:
- When running in otel mode, you see the error `failed to build extensions: failed to create extension "file_storage/filelogreceiver": mkdir ...: permission denied`.
  - Cause: The user running the executable likely lacks permission to create the directory.
  - Resolution: Create the directory manually, or specify a path where that user has the necessary permissions.
6 changes: 5 additions & 1 deletion otel.yml
@@ -2,6 +2,7 @@ receivers:
   filelog:
     include: [ /var/log/system.log ]
     start_at: beginning
+    storage: file_storage/filelogreceiver

 processors:
   resource:
@@ -24,9 +25,12 @@ extensions:
   health_check:
     endpoint: "localhost:13133"
     path: "/health/status"
+  file_storage/filelogreceiver:
+    create_directory: true
+    directory: ${env:STATE_PATH}

 service:
-  extensions: [health_check, memory_limiter]
+  extensions: [health_check, memory_limiter, file_storage/filelogreceiver]
   pipelines:
     logs:
       receivers: [filelog]