-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Revert "fix(redisotel): fix the situation of reporting spans multiple times (#3168)" #3433
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
… times (redis#3168)" This reverts commit 5314a57.
Very sorry for the impact caused. I noticed that a new PR has been submitted to address it. If you need any assistance, please feel free to let me know. |
Additionally, could you provide some examples to describe your issue? In my application, it did indeed generate duplicate links. I executed the same code in different versions, and here are the results. Code section: package otel
import (
"testing"
"github.com/redis/go-redis/extra/redisotel/v9"
"github.com/redis/go-redis/v9"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
sdktrace "go.opentelemetry.io/otel/sdk/trace"
"go.opentelemetry.io/otel/sdk/trace/tracetest"
)
func TestOTel(t *testing.T) {
sr := tracetest.NewSpanRecorder()
provider := sdktrace.NewTracerProvider(
sdktrace.WithSpanProcessor(sr),
)
t.Cleanup(func() {
_ = provider.Shutdown(t.Context())
})
client := redis.NewClusterClient(&redis.ClusterOptions{
Addrs: []string{
"localhost:7001",
"localhost:7002",
"localhost:7003",
},
})
require.NoError(t, redisotel.InstrumentTracing(client, redisotel.WithTracerProvider(provider)))
assert.Len(t, sr.Ended(), 0)
client.Ping(t.Context())
for _, span := range sr.Ended() {
t.Logf("Span: %s, Name: %s, Status: %s", span.SpanContext().TraceID(), span.Name(), span.Status().Code.String())
}
} Results before modificationversion:
output:
We can see that it generated 2 ping records. Modified resultversion:
output:
And here, it's only once. |
@monkey92t feel free to review this and weight in. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR reverts a previous fix that removed default tracing hooks from ClusterClient and Ring Redis clients. The revert addresses a race condition where existing nodes lose tracing capabilities when only new nodes receive tracing hooks during instrumentation.
- Restores default tracing hooks for ClusterClient and Ring types to ensure all nodes have tracing
- Maintains the OnNewNode callbacks for new node instrumentation
- Provides a short-term solution while a more comprehensive fix is being developed
@@ -30,6 +30,8 @@ func InstrumentTracing(rdb redis.UniversalClient, opts ...TracingOption) error { | |||
rdb.AddHook(newTracingHook(connString, opts...)) | |||
return nil | |||
case *redis.ClusterClient: | |||
rdb.AddHook(newTracingHook("", opts...)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a default tracing hook with empty connection string followed by OnNewNode hooks will result in duplicate spans for the same operations. The original PR #3168 that this reverts was specifically fixing this multiple span reporting issue. This revert reintroduces the same problem it was meant to solve.
Copilot uses AI. Check for mistakes.
@@ -38,6 +40,8 @@ func InstrumentTracing(rdb redis.UniversalClient, opts ...TracingOption) error { | |||
}) | |||
return nil | |||
case *redis.Ring: | |||
rdb.AddHook(newTracingHook("", opts...)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a default tracing hook with empty connection string followed by OnNewNode hooks will result in duplicate spans for the same operations. The original PR #3168 that this reverts was specifically fixing this multiple span reporting issue. This revert reintroduces the same problem it was meant to solve.
rdb.AddHook(newTracingHook("", opts...)) |
Copilot uses AI. Check for mistakes.
@flc1125 I've modified your example to include a ping before instrumenting tracing which results in one or no spans. I was unclear yesterday how this was occurring in my actual application, we don't call EDIT: I did just find that we call err := client.ForEachShard(t.Context(), func(ctx context.Context, shard *redis.Client) error {
t.Log("some other setup")
return nil
}) func TestOTel(t *testing.T) {
sr := tracetest.NewSpanRecorder()
provider := sdktrace.NewTracerProvider(
sdktrace.WithSpanProcessor(sr),
)
t.Cleanup(func() {
_ = provider.Shutdown(t.Context())
})
client := redis.NewClusterClient(&redis.ClusterOptions{
Addrs: []string{
"localhost:7001",
"localhost:7002",
"localhost:7003",
},
})
client.Ping(t.Context()) // hard-coded but if there is a race between setting up tracing and setting up nodes then there will be no spans
require.NoError(t, redisotel.InstrumentTracing(client, redisotel.WithTracerProvider(provider)))
assert.Len(t, sr.Ended(), 0)
client.Ping(t.Context())
for _, span := range sr.Ended() {
t.Logf("Span: %s, Name: %s, Status: %s ", span.SpanContext().TraceID(), span.Name(), span.Status().Code.String())
}
} Results from v9.7.3
Results from v9.8.0
|
Verified that tracing is flowing again after making sure instrument tracing is first in our setup, but there might still be a gap. I know that a cluster can have a proxy in-front of the nodes and believe that it can handle some commands like |
This reverts commit 5314a57 (PR) which removed the default tracing hook. We've observed that since v9.8.0 our clients no longer emit traces, I believe this is due to a race condition between nodes getting created and instrumenting tracing. Before v9.8.0 there would always be a default tracing hook, and in the latest version there would only be tracing hooks for new nodes which means any existing nodes would not have tracing.
I've proposed another change that adds tracing hooks for existing nodes, #3432. That requires a breaking change to include context in the params, so I think the revert would make sense in the short term.
cc @ndyakov and @flc1125 since you all were involved in the PR I'm proposing we revert.