Skip to content

trace-core: Dispatchers unset themselves #1033

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Apr 16, 2019
Merged

trace-core: Dispatchers unset themselves #1033

merged 11 commits into from
Apr 16, 2019

Conversation

hawkw
Copy link
Member

@hawkw hawkw commented Apr 4, 2019

This branch changes dispatcher::get_default to unset the thread's
current dispatcher while the reference to it is held by the closure.
This prevents infinite loops if the subscriber calls code paths which
emit events or construct spans.

Note that this also means that nested calls to get_default inside of a
get_default closure will recieve a None dispatcher rather than the
"actual" dispatcher. However, it was necessary to unset the default in
get_default rather than in dispatch methods such as Dispatch::enter,
as when those functions are called, the current state has already been
borrowed.

This introduces a little more overhead into span creation and events.

Before:


     Running target/release/deps/no_subscriber-09f0b39a00181a97

running 5 tests
test bench_1_atomic_load              ... bench:           0 ns/iter (+/- 0)
test bench_costly_field_no_subscriber ... bench:           2 ns/iter (+/- 0)
test bench_log_no_logger              ... bench:           0 ns/iter (+/- 0)
test bench_no_span_no_subscriber      ... bench:           0 ns/iter (+/- 0)
test bench_span_no_subscriber         ... bench:           2 ns/iter (+/- 0)

test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 0 filtered out

     Running target/release/deps/subscriber-992535c5d1c348ad

running 5 tests
test enter_span              ... bench:           3 ns/iter (+/- 0)
test span_no_fields          ... bench:          38 ns/iter (+/- 8)
test span_repeatedly         ... bench:       4,560 ns/iter (+/- 663)
test span_with_fields        ... bench:          49 ns/iter (+/- 17)
test span_with_fields_record ... bench:         363 ns/iter (+/- 71)

After:

     Running target/release/deps/no_subscriber-09f0b39a00181a97

running 5 tests
test bench_1_atomic_load              ... bench:           0 ns/iter (+/- 0)
test bench_costly_field_no_subscriber ... bench:           2 ns/iter (+/- 0)
test bench_log_no_logger              ... bench:           0 ns/iter (+/- 0)
test bench_no_span_no_subscriber      ... bench:           0 ns/iter (+/- 0)
test bench_span_no_subscriber         ... bench:           2 ns/iter (+/- 0)

test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 0 filtered out

     Running target/release/deps/subscriber-992535c5d1c348ad

running 5 tests
test enter_span              ... bench:           3 ns/iter (+/- 0)
test span_no_fields          ... bench:          48 ns/iter (+/- 9)
test span_repeatedly         ... bench:       5,274 ns/iter (+/- 596)
test span_with_fields        ... bench:          55 ns/iter (+/- 11)
test span_with_fields_record ... bench:         346 ns/iter (+/- 71)

Closes #1032

Signed-off-by: Eliza Weisman [email protected]

@hawkw hawkw added C-bug Category: This is a bug. tokio-trace labels Apr 4, 2019
@hawkw hawkw requested a review from carllerche April 4, 2019 00:20
@carllerche
Copy link
Member

Looking good 👍 I only skimmed it so far. I will review it in closer detail tomorrow.

Have you considered flagging the dispatcher as entered instead of swapping out then back? I wonder if that would make an impact in the benches you run.

Roughly, Dispatcher would have a bool entered. If that flag is set, then it does not forward any events to the subscriber.

@hawkw
Copy link
Member Author

hawkw commented Apr 4, 2019

@carllerche We still need to store that bool as part of the thread-local state, rather than on Dispatch, since Dispatch needs to be Sync (precluding the use of a Cell) as it's a field on Span (which we are now asserting is Sync).

Using a flag stored in the thread-local state rather than replacing the dispatcher doesn't seem to result in a big performance improvement:


running 5 tests
test enter_span              ... bench:           3 ns/iter (+/- 0)
test span_no_fields          ... bench:          47 ns/iter (+/- 6)
test span_repeatedly         ... bench:       5,063 ns/iter (+/- 957)
test span_with_fields        ... bench:          57 ns/iter (+/- 6)
test span_with_fields_record ... bench:         361 ns/iter (+/- 37)

However, it does mean that calls to dispatcher::get_default can be nested, as long as calls to the functions on Dispatch that need to be guarded are not also nested. This at least lets us avoid a potentially surprising edge case...

@carllerche
Copy link
Member

@hawkw looking at this, the version w/ the bool flag needs to add a lot of conditionals where the original one did not... I do wonder what made the original (Cell and swap) slower.

@hawkw hawkw requested a review from carllerche April 10, 2019 19:29
@hawkw
Copy link
Member Author

hawkw commented Apr 10, 2019

@carllerche personally, I would prefer to move forward with the bool flag rather than Cell and swap, because it means that nested calls to get_default don't behave surprisingly --- in the bool flag approach, the dispatcher is only disabled when we're actually inside it.

@hawkw
Copy link
Member Author

hawkw commented Apr 15, 2019

@carllerche commit b0c6b34 moves the check back into dispatcher::get_default, rather than guarding individual functions on the dispatcher.

This means that we don't have to access the TLS var twice, and we don't have to access it at all when entering a span handle that already exists. This pretty much entirely negates the performance hit caused by adding the check.

@hawkw
Copy link
Member Author

hawkw commented Apr 15, 2019

Looks like the build failure is due to Azure Pipelines flaking out again; I restarted it.

@hawkw
Copy link
Member Author

hawkw commented Apr 15, 2019

Benchmark results from b0c6b34:

Before:

test enter_span              ... bench:           3 ns/iter (+/- 0)
test span_no_fields          ... bench:          51 ns/iter (+/- 12)
test span_repeatedly         ... bench:       5,073 ns/iter (+/- 1,528)
test span_with_fields        ... bench:          56 ns/iter (+/- 49)
test span_with_fields_record ... bench:         363 ns/iter (+/- 61)

After:

test enter_span              ... bench:           3 ns/iter (+/- 0)
test span_no_fields          ... bench:          35 ns/iter (+/- 12)
test span_repeatedly         ... bench:       4,165 ns/iter (+/- 298)
test span_with_fields        ... bench:          48 ns/iter (+/- 12)
test span_with_fields_record ... bench:         363 ns/iter (+/- 91)

Note that the "after" benchmarks are pretty much in line with the perf on master before adding the check. :D

@hawkw
Copy link
Member Author

hawkw commented Apr 15, 2019

FreeBSD CI is also failing due to an unrelated (and possibly transient?) failure in the threadpool tests:

failures:
---- blocking_thread_does_not_take_over_shutdown_worker_thread stdout ----
thread 'blocking_thread_does_not_take_over_shutdown_worker_thread' panicked at 'called `Result::unwrap()` on an `Err` value: RecvError', src/libcore/result.rs:997:5
failures:
    blocking_thread_does_not_take_over_shutdown_worker_thread
test result: FAILED. 6 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

I restarted it.

hawkw added 11 commits April 16, 2019 11:13
This reduces the performance overhead of this change significantly. The
benchmarks are once again on par with `master`:

```
     Running target/release/deps/no_subscriber-1afb0f83feae37f5

running 5 tests
test bench_1_atomic_load              ... bench:           0 ns/iter (+/- 0)
test bench_costly_field_no_subscriber ... bench:           1 ns/iter (+/- 0)
test bench_log_no_logger              ... bench:           0 ns/iter (+/- 0)
test bench_no_span_no_subscriber      ... bench:           0 ns/iter (+/- 0)
test bench_span_no_subscriber         ... bench:           2 ns/iter (+/- 0)

test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 0 filtered out

     Running target/release/deps/subscriber-7e8d6a743535dbee

running 5 tests
test enter_span              ... bench:           3 ns/iter (+/- 0)
test span_no_fields          ... bench:          34 ns/iter (+/- 4)
test span_repeatedly         ... bench:       3,868 ns/iter (+/- 380)
test span_with_fields        ... bench:          45 ns/iter (+/- 7)
test span_with_fields_record ... bench:         333 ns/iter (+/- 48)

test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 0 filtered out

````

Thanks @carllerche for suggesting this!

Signed-off-by: Eliza Weisman <[email protected]>
Rather than guarding every dispatcher function with an entered check,
this commit moves the check back into `get_default`. This negates most
of the performance hit caused by the entered check.

Note that it may be possible to bypass the check by sneaking a span
handle into the subscriber, but AFAICT this is enough work that it would
have to be done intentionally. The purpose of this check is to guard
against *accidental* infinite loops caused by trace instrumentation in
code a subscriber calls into, so I'm much less concerned about
preventing deliberate misuse.

Before:
```
test enter_span              ... bench:           3 ns/iter (+/- 0)
test span_no_fields          ... bench:          51 ns/iter (+/- 12)
test span_repeatedly         ... bench:       5,073 ns/iter (+/- 1,528)
test span_with_fields        ... bench:          56 ns/iter (+/- 49)
test span_with_fields_record ... bench:         363 ns/iter (+/- 61)
```

After:
```
test enter_span              ... bench:           3 ns/iter (+/- 0)
test span_no_fields          ... bench:          35 ns/iter (+/- 12)
test span_repeatedly         ... bench:       4,165 ns/iter (+/- 298)
test span_with_fields        ... bench:          48 ns/iter (+/- 12)
test span_with_fields_record ... bench:         363 ns/iter (+/- 91)
```

Signed-off-by: Eliza Weisman <[email protected]>
@hawkw hawkw force-pushed the eliza/unset-dispatch branch from b0c6b34 to 84481e8 Compare April 16, 2019 18:13
@hawkw hawkw merged commit 4bfa4ff into master Apr 16, 2019
hawkw added a commit that referenced this pull request Apr 16, 2019
@carllerche carllerche deleted the eliza/unset-dispatch branch April 22, 2019 21:26
hawkw added a commit that referenced this pull request Jun 25, 2019
This branch changes `dispatcher::get_default` to unset the thread's
current dispatcher while the reference to it is held by the closure.
This prevents infinite loops if the subscriber calls code paths which
emit events or construct spans. 

Note that this also means that nested calls to `get_default` inside of a
`get_default` closure will receive a `None` dispatcher rather than the
"actual" dispatcher. However, it was necessary to unset the default in
`get_default` rather than in dispatch methods such as `Dispatch::enter`,
as when those functions are called, the current state has already been
borrowed.

Before:
```
test enter_span              ... bench:           3 ns/iter (+/- 0)
test span_no_fields          ... bench:          51 ns/iter (+/- 12)
test span_repeatedly         ... bench:       5,073 ns/iter (+/- 1,528)
test span_with_fields        ... bench:          56 ns/iter (+/- 49)
test span_with_fields_record ... bench:         363 ns/iter (+/- 61)
```

After:
```
test enter_span              ... bench:           3 ns/iter (+/- 0)
test span_no_fields          ... bench:          35 ns/iter (+/- 12)
test span_repeatedly         ... bench:       4,165 ns/iter (+/- 298)
test span_with_fields        ... bench:          48 ns/iter (+/- 12)
test span_with_fields_record ... bench:         363 ns/iter (+/- 91)
```

Closes #1032 

Signed-off-by: Eliza Weisman <[email protected]>
hawkw added a commit to tokio-rs/tracing that referenced this pull request Jul 12, 2019
The `tracing-fmt` subscriber currently fails to increment a span's
reference count when a child of that span is created, leading to a
situation where child spans can be incorrectly missing their parents.

This is due to an interaction between the changes in PR #31 and the
changes in tokio-rs/tokio#1033. PR #31 changed the `FmtSubscriber` to
round-trip calls to `clone_span` and `drop_span` through the dispatcher,
so that when a `FmtSubscriber` is wrapped by another subscriber, the
whole stack would see the ref-count updates. tokio-rs/tokio#1033 changed
the dispatcher to unset itself when servicing a dispatch call, so as to
avoid infinite loops from events within functions called by the
subscriber. In combination, this means any ref-count changes from within
the dispatcher (i.e. when `new_span` constructs a new span that clones
its parent's ID) are lost.

This branch changes `FmtSubscriber` back to calling ref-count methods on
itself directly. Although this means that it will no longer call the
ref-counting methods on a wrapper subscriber, the `Layer` type added in
PR #136 will soon obviate the need to write such subscribers, and as far
as I can tell, they are not currently being used in production.

I've also added some assertions that should help catch situations where
the `FmtSubscriber`'s span store is corrupted. Previously, we would
silently ignore missing spans, in case a method was called in a `drop`
impl (i.e., when exiting a span or dropping an ID), to avoid double
panics. I've changed these methods to check if the thread is panicking,
and only panic if we are running in debug mode and not already
panicking.

Thanks to @samscott89 for finding this bug!

Signed-off-by: Eliza Weisman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

trace-core: When dispatching an event, the dispatcher should unset itself
2 participants