Proposal: "Plugable backend" Tracing Client/Query library #193

Open
lucasponce opened this issue Dec 9, 2021 · 16 comments

Comments

@lucasponce

lucasponce commented Dec 9, 2021

Hello all,

Apologies in advance, as I am not sure whether this repo is the proper place to start a proposal; please feel free to redirect me to the right channel.

I'd like to ask whether the OpenTelemetry specification is considering a backend-agnostic library for consuming traces stored in any tracing platform (Jaeger, Tempo, Zipkin).

Today, applications can use the OpenTelemetry SDK to ingest traces and spans in a common format. Projects like Kiali (https://kiali.io/) consume metrics, traces, logs, and configuration, then correlate and combine them to offer users added value in the Service Mesh domain.

One request from users is the ability to change the tracing platform (e.g. kiali/kiali#4278). Unfortunately, querying traces from a specific platform requires a technical dependency on that platform (in Kiali's case, Jaeger).

There are proto definitions for a gRPC service, but on this side I don't think it is straightforward to switch from one platform to another (or at least I'm not aware of an effort in the open-telemetry group around this).

So, one possibility is to wait for the specific implementations (Jaeger, Zipkin, Tempo, others) to implement a common API for backend queries. For backend applications, though, it would be nice to have a single dependency (e.g. opentelemetry-backend) that can query one backend or another in a common format.

In the Kiali project this is something we'd like to offer to users, and we'd like to explore the possibility of creating a proxy client [2] that shields the consumer of traces/spans from the details of specific implementations (Jaeger, Tempo, others), making it easy to change from one platform to another.
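
To make the idea concrete, here is a minimal Python sketch of what such an abstraction could look like. Everything in it is hypothetical: `Span`, `TraceQueryBackend`, `find_spans`, and `InMemoryBackend` are illustrative names, not part of any OpenTelemetry, Jaeger, or Tempo API, and the in-memory class only stands in for a real adapter. The point is that the consumer function depends on the interface alone.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol


@dataclass(frozen=True)
class Span:
    """Backend-neutral span model (hypothetical, for illustration only)."""
    trace_id: str
    span_id: str
    service: str
    operation: str
    duration_ms: float


class TraceQueryBackend(Protocol):
    """What a consumer would depend on instead of a platform-specific client."""

    def find_spans(self, service: str) -> List[Span]:
        ...


class InMemoryBackend:
    """Stand-in for a real adapter (Jaeger, Tempo, ...); keeps spans in a list."""

    def __init__(self, spans: List[Span]) -> None:
        self._spans = spans

    def find_spans(self, service: str) -> List[Span]:
        return [s for s in self._spans if s.service == service]


def slowest_operation(backend: TraceQueryBackend, service: str) -> Optional[str]:
    """Consumer logic written once, against the interface only."""
    spans = backend.find_spans(service)
    if not spans:
        return None
    return max(spans, key=lambda s: s.duration_ms).operation


backend = InMemoryBackend([
    Span("t1", "s1", "reviews", "GET /reviews", 12.5),
    Span("t1", "s2", "reviews", "GET /ratings", 40.0),
    Span("t2", "s3", "details", "GET /details", 3.0),
])
print(slowest_operation(backend, "reviews"))
```

Replacing `InMemoryBackend` with an adapter for a real platform would leave `slowest_operation` untouched, which is the property the proposal is after.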

We think this effort could be interesting for the OpenTelemetry community and it would be nice if it's fostered under the OpenTelemetry umbrella allowing users to participate more actively.

My goal here would be to get feedback about:

  • Is this idea interesting for the OpenTelemetry community?
  • If a similar project has already started, we would rather join forces and collaborate with it than start anything new.
  • If not, we could volunteer to start a PoC, in a similar fashion to [2] but aimed at a wider audience, not only at the specific requirements of the Kiali project.

Any feedback would be welcome.

Thank you!
Lucas

[2] https://github.com/lucasponce/jaeger-proto-client

@zirain

zirain commented Feb 25, 2022

@lucasponce Any update on this? We're facing the same request.

@lucasponce
Author

Hi @zirain, I was waiting to have some feedback.
But I think we could work in two directions:

  • One could be to start a tracing-backend-library, inspired by this PoC https://github.com/lucasponce/jaeger-proto-client, that can plug in any Jaeger-compatible solution (Jaeger/Tempo) for consumers. I could start using it in the Kiali project.
  • Next, with the feedback, we could shape a common API and try to promote it as a specification that Jaeger, Tempo, and OpenTelemetry, in general, could adopt in a native way.

I think it could be a good start for "backend" consumers, and probably the current API of Jaeger could be a good API to start adding potential modifications that the community would need.

How does that sound? If several parties are interested, I think the project would be easier to maintain.

@zirain

zirain commented Feb 25, 2022

I have some rough ideas; sending them to you via Slack.

@Kampe

Kampe commented Apr 1, 2022

> I think it could be a good start for "backend" consumers, and probably the current API of Jaeger could be a good API to start adding potential modifications that the community would need.

Hi! Community member here! Happen to be downstream of this decision as a user of Kiali that can't query Tempo's Jaeger backend. I'd definitely +1 starting with Jaeger APIs that already exist ;)

@vikasmalhotra08

Hi @lucasponce ,

I wanted to understand what the lift would be to get OpenTelemetry working with Kiali. I would be happy to work on it, and if someone can guide me on the effort required, I would love to try it out, as there is internal demand to get this working.

@lucasponce
Author

Hi @vikasmalhotra08,

Thanks for the interest.

Speaking for Kiali, the need that triggered this proposal is to have a unified way to query the backend storage of the telemetry data (in the first step, the tracing data).

The OpenTelemetry standard is focused on instrumented applications and collectors, but there is a gap (no standard) when it comes to a common API for querying the stored data.

Kiali integrates with Jaeger, meaning that we build the proto definitions to get a client that queries tracing data; the same happens with Prometheus and the PromQL language.

But there is also interest from the community in using other tracing solutions (like Tempo). So, instead of having to implement/import N clients (one per product), we think it would be nice to have a unified client that can proxy these APIs via configuration (so apps that consume telemetry can always query in the same way, and switching between Jaeger and Tempo is a configuration change).

I think this effort can be reused by anyone and potentially extended not only for tracing but other signals (telemetry, logs?) in the future, and the best way to host can be the open-telemetry group.

To be practical, I think what can move this forward is a PoC/first project called something like "open-telemetry-backend-client", which would be the artifact that Kiali or any other project uses to query tracing data.

This client should be responsible for dealing with the details of each provider's API (Jaeger, Tempo), and which backend you connect to would be a matter of configuration.
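
As a rough illustration of "a configuration change to use Jaeger or Tempo", the selection could be a small registry keyed by a config value. This is only a sketch under assumptions: the `tracing.provider` and `tracing.url` keys, the adapter class names, and the example URLs are all made up here, and the adapters are empty placeholders that perform no network calls.

```python
from typing import Callable, Dict


class BaseAdapter:
    """Shared shape for the placeholder adapters below (hypothetical)."""

    def __init__(self, url: str) -> None:
        self.url = url


class JaegerAdapter(BaseAdapter):
    """Placeholder: a real adapter would talk to a Jaeger query endpoint."""


class TempoAdapter(BaseAdapter):
    """Placeholder: a real adapter would talk to a Tempo query endpoint."""


# Registry mapping a configuration value to an adapter constructor.
ADAPTERS: Dict[str, Callable[[str], BaseAdapter]] = {
    "jaeger": JaegerAdapter,
    "tempo": TempoAdapter,
}


def client_from_config(config: dict) -> BaseAdapter:
    """Build the adapter named in the config; consumers never pick one in code."""
    tracing = config["tracing"]
    provider = tracing["provider"]
    if provider not in ADAPTERS:
        raise ValueError(f"unsupported tracing provider: {provider!r}")
    return ADAPTERS[provider](tracing["url"])


# Made-up configuration values, for illustration only.
config = {"tracing": {"provider": "jaeger", "url": "http://jaeger-query.example:16685"}}
client = client_from_config(config)

# Switching backends touches the configuration, not the consuming code.
config["tracing"] = {"provider": "tempo", "url": "http://tempo.example:3200"}
client_after_switch = client_from_config(config)
```

Supporting another provider would then mean registering one more adapter, without touching the applications that call `client_from_config`.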

Ideally, there would be a standard that any provider can follow, and then there would be no need for this proxy client; until then, I think we need one.

I had one experiment about this in this repo https://github.com/lucasponce/jaeger-proto-client, but the idea would be that from the app that needs data from the providers, there would be a single dependency to connect to them.

I didn't have time to progress in the PoC, but if the open-telemetry group would be willing to foster a repo for it, I would be happy to update that client to see if it helps to move this initiative.

@ceastman-r7

@tigrannajaryan Any thoughts on this request?

@tigrannajaryan
Member

> @tigrannajaryan Any thoughts on this request?

I am not sure I understand the request. Is this about having an OpenTelemetry-standardized API (and possibly a query language) to query the telemetry that is already stored in a backend like Jaeger?

@lucasponce
Author

> I am not sure I understand the request. Is this about having an OpenTelemetry-standardized API (and possibly a query language) to query the telemetry that is already stored in a backend like Jaeger?

In a nutshell, there is a need to query telemetry data stored in backends.

There is no strong "standard" defined in this area, and having something in that direction would mean that the "consumers" of this telemetry data don't need to implement N clients (one per product).

Probably there could be a common set of features that can be "standardized," and the OpenTelemetry group may foster a "standard client" for these needs.

I see there is some interest in this area; other folks like @itaykat may be interested as well.

(I may not have enough bandwidth to push for this effort, but happy to share thoughts and ideas on this issue, I think there is a need there, and several players may benefit if this effort is started).

@tigrannajaryan
Member

> In a nutshell, there is a need to query telemetry data stored in backends.

This is not very well aligned with the current vision of OpenTelemetry. OTel has mostly stayed away from backend functionality in the past. See https://opentelemetry.io/docs/ where we say:

OpenTelemetry, also known as OTel for short, is a vendor-neutral open-source Observability framework for instrumenting, generating, collecting, and exporting telemetry data such as traces, metrics, logs.

See how it stops at "exporting". With some small exceptions we consider our job done once the telemetry is delivered to the backend.

That said, I do understand how it can be valuable to have a telemetry query standard. If there is enough interest in this, then OTel may potentially consider working on it. This is a major initiative, though, and initiatives like this are typically started by folks who are interested in the topic, form a workgroup, and propose it as a project (which may or may not be accepted by OTel). See more about relevant processes here and here. Step 1 would be to find a sufficiently large group of people with relevant experience willing to spend significant time on this initiative.

@itaykat

itaykat commented Sep 7, 2022

> (I may not have enough bandwidth to push for this effort, but happy to share thoughts and ideas on this issue, I think there is a need there, and several players may benefit if this effort is started).

@lucasponce @tigrannajaryan do you know of any other telemetry consumers (services, tools, platforms, frameworks, organizations, etc.) that may benefit from such an API standardization other than the ones mentioned in this issue?

@lucasponce
Author

Thanks, @tigrannajaryan.

Well, the motivation for this proposal is that the "observability" space is maturing: as more "senders" align with "open-telemetry" standards to "send/collect" the data, the "consumers" on the other side (like Kiali, or others) need to deal with N backend APIs to consume and process this information.

@itaykat, I don't have formal visibility of more consumers, but I guess that sooner rather than later, at the same pace that "open-telemetry" adoption grows, there will be a need to progress in this direction as well.

@ceastman-r7

For instance, https://github.com/grafana/tempo is an alternative to Jaeger (i.e. it collects tracing data using standards like OTel), but its backend API is not the same because there is currently no standard for backend tracing APIs.
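
To illustrate what differing backend APIs mean for a consumer, here is a small Python sketch that maps two response shapes onto one common record. The payloads are simplified, made-up stand-ins and should not be read as the exact Jaeger or Tempo response formats; the function names and the `duration_us` field are likewise assumptions for this example.

```python
from typing import Dict, List

# Shape A (made up): spans nested under traces, duration in microseconds.
PAYLOAD_A = {
    "data": [
        {
            "traceID": "abc",
            "spans": [
                {"spanID": "1", "operationName": "GET /reviews", "duration": 1500},
            ],
        }
    ]
}

# Shape B (made up): spans nested under batches, start/end in nanoseconds as strings.
PAYLOAD_B = {
    "batches": [
        {
            "spans": [
                {
                    "traceId": "abc",
                    "spanId": "1",
                    "name": "GET /reviews",
                    "startTimeUnixNano": "1000000",
                    "endTimeUnixNano": "2500000",
                },
            ]
        }
    ]
}


def normalize_a(payload: dict) -> List[Dict]:
    """Map shape A onto the common record."""
    return [
        {
            "trace_id": trace["traceID"],
            "span_id": span["spanID"],
            "operation": span["operationName"],
            "duration_us": span["duration"],
        }
        for trace in payload["data"]
        for span in trace["spans"]
    ]


def normalize_b(payload: dict) -> List[Dict]:
    """Map shape B onto the common record, converting nanoseconds to microseconds."""
    records = []
    for batch in payload["batches"]:
        for span in batch["spans"]:
            start = int(span["startTimeUnixNano"])
            end = int(span["endTimeUnixNano"])
            records.append(
                {
                    "trace_id": span["traceId"],
                    "span_id": span["spanId"],
                    "operation": span["name"],
                    "duration_us": (end - start) // 1000,
                }
            )
    return records
```

Without a standard, every consumer ends up writing and maintaining a pair of functions like these for each backend it supports.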

@itaykat

itaykat commented Sep 12, 2022

So I've started creating a specification for such an API. We have some more insights on this topic from our customers at Epsagon, which are pretty helpful; more insights and validation are welcome.

Where do you think would be a good place to share the work on such an in-progress specification?

  1. OpenTelemetry OTEP
  2. OpenTelemetry Project
  3. Dedicated GitHub repo within the OpenTelemetry organization (something like 'opentelemetry-backend-specification')
  4. External GitHub repo (the rationale may be that this is out of scope from the OTel perspective)
  5. Some other collaboration tool

Any other method of collaboration is welcome. Please share your take; I want to push this initiative forward.

@tigrannajaryan
Member

I think this is likely larger than can be discussed and accepted in a single OTEP. It will likely need to be an ongoing project. However, if you propose a project right now, it is unlikely to be accepted without sufficient supporting evidence, and building that evidence takes time (see project requirements, particularly for staffing, leading, sponsors, etc). It may be best to create your own repository and start working on it while looking for more people to help.
In the past we have also used Google documents for easier group collaboration (e.g. Logging SIG started with the discussion of the Log data model in a document).

@itaykat

itaykat commented Sep 14, 2022

@tigrannajaryan Thank you for advising here. Can you please take a look at the OTEP I opened today? We can move the conversation over there.
@lucasponce FYI.

@yurishkuro changed the title from Proposal: "Plugable backend" Tracing library to Proposal: "Plugable backend" Tracing Client/Query library on Sep 15, 2022
7 participants