Add proposal for embedded exporters in OTel Collector #69
Conversation
It sounds like this proposed architecture would require that Prometheus exporters be implemented in Go and explicitly linked into the OTel Collector binary. We previously solved this in the Collector with a component that has since been deprecated. I think it's a significant roadblock to require that exporters be written in Go and that each one make per-exporter code changes to adopt the new interface.
Go's lack of a decent interface for runtime loading severely limits the options here. It's possible to combine https://pkg.go.dev/plugin with CGO. Rust-based exporters are starting to appear and are likely to grow in popularity over time, so tying exporters even more strongly to Go might not be ideal.
AIUI, the Collector objects to loading code from a config-controlled path, whether that's as a plugin or as a separate executable. I think there could be a really elegant child-process solution here, where you run HTTP and/or gRPC over the subprocess's stdin/stdout to ship Prometheus and/or OTLP metrics back from the exporter to the Collector.
I'm not super familiar with the whole exporter ecosystem, but my impression was that there's already a decent roster of Python exporters as well.
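To make the child-process idea concrete, here is a minimal sketch, assuming a hypothetical exporter binary that speaks HTTP over its stdin/stdout instead of a TCP port (the `./node_exporter --stdio` invocation is invented for illustration): the parent wires the child's pipes into an `http.Transport` and scrapes `/metrics` through them.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"os/exec"
	"time"
)

// stdioConn adapts a child process's stdout (reads) and stdin (writes)
// to a net.Conn so net/http can speak HTTP over the pipes.
type stdioConn struct {
	io.Reader
	io.WriteCloser
}

func (c stdioConn) Close() error                   { return c.WriteCloser.Close() }
func (stdioConn) LocalAddr() net.Addr              { return stdioAddr{} }
func (stdioConn) RemoteAddr() net.Addr             { return stdioAddr{} }
func (stdioConn) SetDeadline(time.Time) error      { return nil }
func (stdioConn) SetReadDeadline(time.Time) error  { return nil }
func (stdioConn) SetWriteDeadline(time.Time) error { return nil }

type stdioAddr struct{}

func (stdioAddr) Network() string { return "stdio" }
func (stdioAddr) String() string  { return "stdio" }

func main() {
	// Hypothetical exporter that serves HTTP on stdin/stdout; the
	// --stdio flag does not exist today and is assumed for this sketch.
	cmd := exec.Command("./node_exporter", "--stdio")
	stdin, err := cmd.StdinPipe()
	if err != nil {
		panic(err)
	}
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		panic(err)
	}
	if err := cmd.Start(); err != nil {
		panic(err)
	}

	conn := stdioConn{Reader: stdout, WriteCloser: stdin}
	client := &http.Client{
		Transport: &http.Transport{
			// Every "dial" hands back the same stdio-backed connection.
			DialContext: func(context.Context, string, string) (net.Conn, error) {
				return conn, nil
			},
		},
	}

	// The host part of the URL is arbitrary; the transport never resolves it.
	resp, err := client.Get("http://child/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("scraped %d bytes of metrics from the child process\n", len(body))
}
```

The same wiring would presumably work for gRPC plus OTLP over the pipes; only the client on top of the connection changes.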
Interesting! This is the first time I've heard about this component. I can totally understand why it was deprecated, and if I understand it correctly, it just executed binaries, right? You still needed to run the usual prometheusreceiver and configure it to scrape the executed binaries, is that correct?

With the approach I'm suggesting here, I understand the limitation of only supporting exporters written in Go, but do you think the overall experience is better or worse than the previous attempt? I can see value in not spawning new subprocesses, and not requiring a scrape configuration also simplifies things from my point of view :)

I agree it's a roadblock if our goal is to enable ALL exporters independently of language. I was aiming for a slightly less ambitious project: to give a way forward to very widely used exporters like node_exporter, kube-state-metrics, blackbox_exporter, cadvisor, mysqld_exporter, etc. With that said, I'd be super happy to hear ideas that support even more languages and incorporate them into the proposal. My preference would still be something that we can embed as part of the Collector pipeline, rather than separate processes that still require extra scrape configuration.

That sounds cool! Would you be willing to write down your ideas in a Google Doc? Or even a counter-proposal to this one?
Just as a data point: the majority of Prometheus exporters are written in Go.
I don't understand why it was deprecated. :) AIUI the concern was "if the user can configure a command to run, then we will run the command, and that command could do anything". Of course that's true (and if you give configuration access to someone you don't give shell access to, they could use that to escalate to a shell), but the Collector config also lets you read and write arbitrary files, as well as open arbitrary network connections, all of which can easily be escalated to running a command. You need to treat the Collector config as sensitive either way.

No, the deprecated component handled the scraping itself, so you didn't need a separate prometheusreceiver configuration.
In isolation for a single exporter, it's better, but I think as an ecosystem of exporters, it's worse because you can only use a limited subset of exporters (only the ones that are written in Go and have adopted the new interface).
I can try to put something together, but I'm not sure I'm familiar enough with the high-level Prometheus direction to know which style would be preferred (HTTP or gRPC? OTLP, text format, or proto format metrics? etc.)
Great question. I agree that this is potentially a significant blocker for OTel Collector adoption - today, most people use kitchen-sink binaries that contain ~every public component, but if the ecosystem grows as designed, that's not going to be viable going forward.
I think what I suggest above (running a child process with HTTP or gRPC) is very similar to how Terraform plugins get used. I'm not aware of any proposals along those lines in the OTel world right now.
I've discussed this offline with @roidelapluie, and he raised some good points that the exporter-toolkit may not be the best place to build this interface if we decide to do so. The exporter-toolkit was designed to facilitate HTTP interactions, and creating a scraper and adapter seems far from its original design. I'll update the proposal to mention that we'll create a stand-alone Go library for this work. I also see the dependency-hell problem; I'm still figuring that out 🤗
That's fantastic. That would allow integration with exporters written in any language, and the Go dependency problem wouldn't exist. However, specifically for Go, this approach feels like doing extra work: a client_model.dto gets encoded into the Prometheus exposition format only to be converted back into a client_model.dto and finally into OTel's format. @quentinmit, another problem I see with …

We're dreaming a little bit here, though. I don't want to let "perfect" get in the way of "progress". I'd be happy to collaborate with anyone who wants to push this plugin system forward, but it feels too far away at the moment.
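To illustrate that round trip concretely, here is a small sketch using client_golang and expfmt (the gauge is just a placeholder): the exporter's metrics already exist as dto.MetricFamily values, get serialized to the text exposition format, and are then parsed straight back into dto.MetricFamily before any conversion to OTel's data model would even begin.

```go
package main

import (
	"bytes"
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/common/expfmt"
)

func main() {
	// In-process exporter state: a registry with one placeholder gauge.
	reg := prometheus.NewRegistry()
	g := prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "demo_temperature_celsius",
		Help: "A demo gauge.",
	})
	reg.MustRegister(g)
	g.Set(21.5)

	// Step 1: the metrics are already dto.MetricFamily values in memory.
	families, err := reg.Gather()
	if err != nil {
		panic(err)
	}

	// Step 2: encode to the text exposition format, which is what a /metrics
	// endpoint (or a child process over stdio) would ship to the Collector.
	var buf bytes.Buffer
	for _, mf := range families {
		if _, err := expfmt.MetricFamilyToText(&buf, mf); err != nil {
			panic(err)
		}
	}

	// Step 3: parse the text right back into dto.MetricFamily, which would
	// then be translated into OTel's data model. For an in-process Go
	// exporter, steps 2 and 3 are the redundant round trip described above.
	var parser expfmt.TextParser
	reparsed, err := parser.TextToMetricFamilies(&buf)
	if err != nil {
		panic(err)
	}
	fmt.Printf("round-tripped %d metric families through the text format\n", len(reparsed))
}
```

For exporters written in Go, an in-process adapter could skip the two middle steps entirely.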
We don't expect you to write a perfect proposal from the start; you can suggest your preferences, and the community will give you feedback on their preferences as well. Just like you're doing here :)
beorn7 left a comment:
Only a few high-level remarks.
Neither my OTel nor my exporter knowledge is detailed enough to judge the detailed design (interfaces and such). @SuperQ is probably the most important person to look at this.
I hope this will work out. Sounds like a great idea in general.
## Why
The OpenTelemetry Collector ecosystem faces a significant challenge: many components in collector-contrib are "drop-in" replacements for existing Prometheus exporters but often become unmaintained before reaching stability. This duplication of effort occurs because the promise of "one binary to collect all telemetry" is valuable to users, leading to reimplementation of functionality that already exists in mature Prometheus exporters.
"one binary to collect all telemetry" is valuable to users
If you ask me, it's a pretty bad idea for the users, or in other words: it has a net-negative value to users.
But obviously, this discussion is out of scope for this design doc. I propose not to pass judgment on the approaches here: just state that this approach is the one taken in the OTel Collector ecosystem, without valuing it either way.
Yeah, I didn't mean to express an opinion about whether it is good or bad; I wanted to say that we noticed users choosing this path for metric collection.
This issue became particularly visible during OpenTelemetry's CNCF Graduation attempt, where feedback highlighted that users often feel frustrated when upgrading versions. In response, the Collector SIG decided to be stricter about accepting new components and more proactive in removing unmaintained or low-quality ones.
Meanwhile, the Prometheus community has developed hundreds of exporters over many years, many of which are stable. Creating parallel implementations in the OpenTelemetry ecosystem wastes community resources and often results in "drive-by contributions" that are abandoned shortly after acceptance.
Somehow, this section should mention that Prometheus exporters are "metrics only" for obvious reasons. Therefore, this is only talking about metrics collectors. To collect other telemetry signals, other collectors are still required.
This could, BTW, be another reason to go down this path: if you need an OTel-style collector anyway for the non-metrics signals, the embedding approach still allows you to plug in Prometheus exporters for metrics. (Currently, if you are not happy with the metrics from the OTel-native metrics collector, you need to run the Prometheus exporter in addition to the OTel Collector just for the metrics.)
* Enable embedding of Prometheus exporters as native OpenTelemetry Collector receivers via the OpenTelemetry Collector Builder (OCB).
* Reduce duplication of effort between Prometheus and OpenTelemetry communities.
* Maintain the "single binary" promise for users who want comprehensive telemetry collection.
Suggested change:
- * Maintain the "single binary" promise for users who want comprehensive telemetry collection.
+ * Maintain the "single binary" promise for users who follow the single binary approach.
Users who want comprehensive telemetry collection can also have it with multiple binaries. The need for a single binary really only arises from the wish for a single binary itself.
### Overview
Prometheus exporters function similarly to OpenTelemetry Collector receivers: they gather information from infrastructure and expose it as metrics. The key difference is the output format and collection mechanism. Prometheus exporters expose an HTTP endpoint (typically `/metrics`) that is scraped, while OpenTelemetry receivers push metrics into a pipeline.
And Prometheus exporters are only concerned with metrics, no other telemetry signals.
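As a small illustration of the pull model described in the quoted paragraph, here is a toy exporter that serves a single gauge on `/metrics` for scraping; the metric name and port are placeholders. An embedded receiver would instead push equivalent data into the Collector pipeline.

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// Dedicated registry, as most exporters use instead of the global one.
	reg := prometheus.NewRegistry()

	up := prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "demo_target_up", // placeholder metric name
		Help: "Whether the demo target is reachable.",
	})
	reg.MustRegister(up)
	up.Set(1)

	// The exporter only exposes /metrics; Prometheus (or the Collector's
	// prometheusreceiver) pulls from it on its own schedule.
	http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	_ = http.ListenAndServe(":9101", nil) // placeholder port
}
```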
### Known Problems
1. **Dependency conflicts**: Prometheus exporters and OpenTelemetry collector-contrib use different dependency versions. Building a distribution with both may require dependency alignment or replace directives.
We need to be careful to not accidentally link in half of the OTel ecosystem into a normal exporter build. In principle, I assume that the exporter build should be mostly unaffected, but we know how things go if you do some innocuous import of some client library…
It's inevitable that building the adapter introduces OTel dependencies, but I see three ways forward (the third is sketched below):
- A Prometheus exporter does not implement the interfaces and continues not to be embeddable.
- A Prometheus exporter implements the interfaces in its main module and inherits the OTel dependencies.
- A Prometheus exporter creates a separate module that will be embeddable, while the original module remains intact.
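As a rough illustration of that third option, a hypothetical separate module could translate the families returned by the exporter's registry directly into pdata, so only this module pulls in Collector dependencies. The module path, function name, and gauge-only handling are assumptions for the sketch, not part of the proposal.

```go
// Package promadapter is a hypothetical separate Go module (e.g. living
// next to an exporter's main module) so that only this module depends on
// the OTel Collector's pdata packages.
package promadapter

import (
	"time"

	dto "github.com/prometheus/client_model/go"
	"go.opentelemetry.io/collector/pdata/pcommon"
	"go.opentelemetry.io/collector/pdata/pmetric"
)

// ToPMetrics converts gathered client_model families directly into pdata,
// skipping the text exposition format. Only gauges are handled here.
func ToPMetrics(families []*dto.MetricFamily) pmetric.Metrics {
	md := pmetric.NewMetrics()
	sm := md.ResourceMetrics().AppendEmpty().ScopeMetrics().AppendEmpty()
	now := pcommon.NewTimestampFromTime(time.Now())

	for _, mf := range families {
		if mf.GetType() != dto.MetricType_GAUGE {
			continue // counters, histograms, and summaries omitted in this sketch
		}
		m := sm.Metrics().AppendEmpty()
		m.SetName(mf.GetName())
		m.SetDescription(mf.GetHelp())
		gauge := m.SetEmptyGauge()
		for _, sample := range mf.GetMetric() {
			dp := gauge.DataPoints().AppendEmpty()
			dp.SetTimestamp(now)
			dp.SetDoubleValue(sample.GetGauge().GetValue())
			for _, lp := range sample.GetLabel() {
				dp.Attributes().PutStr(lp.GetName(), lp.GetValue())
			}
		}
	}
	return md
}
```

The exporter's main module would stay free of OTel imports; only distributions built with OCB would compile this package in.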
2. **Scope of adoption**: It's unclear how many Prometheus exporters will adopt these interfaces. The proposal targets exporters in the `prometheus` and `prometheus-community` GitHub organizations initially.
3. **Metric semantics**: Subtle differences in how Prometheus and OpenTelemetry handle certain metric types may require careful mapping.
I see a source of friction in the different naming conventions.
The idea with naming schemas could complement this nicely and should maybe be mentioned here.