Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: Switch to accepting sync input closures, adjust defaults, add docs #9

Merged
merged 10 commits into from
Jan 2, 2025

Conversation

jlizen
Copy link
Owner

@jlizen jlizen commented Dec 26, 2024

Overview

In the previous iteration we were accepting futures and trying to execute them in a way that accommodated blocking. This worked fine for the secondary tokio runtime, but was flawed for the spawn_blocking and block_in_place executor. Because, in fact they were spawning delegated tasks right back to the worker thread.

After thinking about it more, it has become clear to me that the most immediate benefit of this library will be to instead allow libraries to provide sync inputs, not async. It's a convenience to also handle async, but the use cases I'm currently tracking (tls handshaking, high-cardinality metric publication), reduce to sync segments when handled most optimally.

To quote from the new README:

Today, when library authors are write async APIs, they don't have a good way to handle long-running sync segments.

An application author can use selective handling such as tokio::task::spawn_blocking() along with concurrency control to delegate sync segments to blocking threads. Or, they might send the work to a rayon threadpool.

But, library authors generally don't have this flexibility. As, they generally want to be agnostic across runtime. Or, even if they are tokio-specific, they generally don't want to call tokio::task::spawn_blocking() as it is
suboptimal without extra configuration (concurrency control) as well as highly opinionated to send the work across threads.

This library aims to solve this problem by providing libray authors a static, globally scoped strategy that they can delegate blocking sync work to without drawing any conclusions about handling.

And then, the applications using the library can either rely on the default strategy that this package provides, or tune them with their preferred approach.

Changes

Shuffle around traits to use execute_sync and related executors

Also did some package reorg.

The core trait now is:

pub(crate) trait ExecuteSync {
    /// Accepts a sync function and processes it to completion.
    async fn execute_sync<F, R>(&self, f: F) -> Result<R, Error>
    where
        F: FnOnce() -> R + Send + 'static,
        R: Send + 'static;
}

Notably it is async, this is important since we are essentially trying to delegate the blocking segment to another thread, so that we can await its completion in the current thread while sleeping.

I opted to keep sync in the naming everywhere at the cost of verbosity. I did this for two reasons:

  • make it more explicit what it's doing
  • leave the door open for adding async executor handling as well down the road without breakage

Remove strategies

These are removed:

Secondary tokio runtime

Mostly useful for comingled async/sync, suboptimal threadpool for pure sync.

Block_in_place

This does apply to sync, but it just has so many footguns with how it breaks code running concurrently on the same task and how it can starve out your worker threads. I want to introduce friction to using it by requiring that it be a custom executor, so that people don't blame library authors if it breaks them.

I did include it in a doc example for the custom executor, linked to block in place docs, with ample warnings.

Changes to CustomExecutor

So the custom executor now accepts a sync closure rather than async. This makes it somewhat easier to name.

However, I still opted to keep the wrapping with a oneshot channel to keep the type out of the input/output to avoid generic bounds. Previously this served two purposes - type erasure to avoid generics in static bounds, but also to add cancellability of the future.

Now, the cancellation is no longer relevant. Theoretically we could instead go back to the Any usage that reduces allocations a bit and also preserves types for the closure to interact with (sort of, anyway, via downcasting).

But, the API becomes much uglier with Any. Also, most use cases are going to need a channel anyway (eg rayon channel). So, I left it as-is for now. Open to alternatives.

I still would ultimately like to add an API that makes types available to the closure (ref #3) but it feels like scope creep until there are people that have a use for it.

Updates to defaults

tokio feature now defaults to SpawnBlocking strategy, with concurrency limited to the cpu core count. That concurrency limit is actually pretty important, it takes spawn_blocking from being a suboptimal threadpool for compute, to a nice lightweight one that lazily spins up threads with configurable idle TTL.

We now default to tokio feature being on. I feel that this is likely the behavior that most users want. If tokio is enabled, the user gets a default that actually is probably a good default, as opposed to CurrentContext which adds concurrency control but nothing else.

I did explicitly add discussion of feature enablement aimed at both library authors + application authors in the readme + module docs.

Doc updates

Took a first stab at drafting module-level docs and a README. Also spruced up docs all over the place.

Notably, added add a rayon threadpool custom strategy in an integration test. I also have a module-level doc sample involving Rayon.


fixes #8 #7 #4

…g/custom/current_context, tweak defaults to enable concurrency limiting + use spawn_blocking for tokio feature, add more docs
@jlizen
Copy link
Owner Author

jlizen commented Dec 30, 2024

Started integrating with tokio-rustls to demonstrate the API, see: rustls/tokio-rustls#99

@rcoh
Copy link

rcoh commented Dec 30, 2024

As we stabilize this API, I wonder if we should have a way to pass configuration options during scheduling of futures.

For example, the caller could indicate that suspending the current task was acceptable (allowing the executor to use block_in_place)

src/executor/mod.rs Outdated Show resolved Hide resolved
@jlizen
Copy link
Owner Author

jlizen commented Dec 30, 2024

As we stabilize this API, I wonder if we should have a way to pass configuration options during scheduling of futures.

For example, the caller could indicate that suspending the current task was acceptable (allowing the executor to use block_in_place)

Agreed, I was considering this in #4 but figured it was scope creep for initial impl.

I think probably rather than totally arbitrary config, since caller doesn't have too much context on possibilities, we want bimodal or trimodal API pattern. Where there is 'likely to block / for a long time' or 'somewhat like / for a short time'. And then application author could configure additional strategies if they are able to eg block_in_place for middle case. Or it could default to a pass-through to the primary strategy (or non-op, tbd).

I think I probably prefer adding a net new api (ie, execute_sync_maybe_blocking()) rather than overloading the existing. Since either way the strategy is opaque to the library author, and this way the API change isn't breaking. Can hash it out later though.

Though, there is even further granularity available if we did have config (priority, etc). I could see an argument for bubbling that up, at cost of complexity for library authors? I was thinking about that from the other end (#3)

@jlizen
Copy link
Owner Author

jlizen commented Dec 30, 2024

Capturing feedback from @rcoh in offline discussion:

  • shift to just using CurrentContext with no concurrency as a default in all cases, so libraries adding the feature flag in the dependency tree doesn't shift behavior. this means that application authors will ALWAYS need at least some configuration to get benefit.
  • add a 'batteries included' helper for tokio feature that makes it trivial to configure with sensible defaults
  • accept an enum on 'likeliness to block' in the execute_sync() call that is just high currently to support later fanout of strategies
  • come up with a shorter name (we discussed sync-future-executor, but that's still pretty long)

…te_sync()` to specify likeliness to block, add helper for constructing good default tokio strategy
Cargo.toml Outdated Show resolved Hide resolved
Cargo.toml Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
src/executor/mod.rs Show resolved Hide resolved
src/executor/mod.rs Outdated Show resolved Hide resolved
src/executor/mod.rs Outdated Show resolved Hide resolved
src/executor/mod.rs Outdated Show resolved Hide resolved
src/executor/mod.rs Outdated Show resolved Hide resolved
src/executor/mod.rs Outdated Show resolved Hide resolved
src/executor/spawn_blocking.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated Show resolved Hide resolved
@jlizen
Copy link
Owner Author

jlizen commented Dec 30, 2024

@rcoh I think all your feedback has been addressed, thanks for the close read.

@jlizen jlizen requested a review from rcoh December 30, 2024 22:18
Copy link

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops never submitted these

src/executor/mod.rs Outdated Show resolved Hide resolved
src/executor/mod.rs Outdated Show resolved Hide resolved
Copy link

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like where we ended up here! left a couple more small thoughts

src/executor/mod.rs Outdated Show resolved Hide resolved
src/executor/mod.rs Outdated Show resolved Hide resolved
src/executor/mod.rs Outdated Show resolved Hide resolved
@jlizen
Copy link
Owner Author

jlizen commented Jan 2, 2025

Capturing offline feedback from rcoh@:

  • we should probably add a test that Execute is object safe
  • should add support for passing in a handle to a strategy with execute, rather than globally scoped, so that libraries can load selective strategy config (will be a separate PR, will cut an issue)

@jlizen
Copy link
Owner Author

jlizen commented Jan 2, 2025

Alrighty, pushed up changes that address your feedback @rcoh, thanks!

  • Implicitly box the custom closure for the custom executor rather than requiring caller to specify
    • Note that we still do require the caller box the input and output, since you can't pass an impl bounds into a Fn() input/output. Slightly more ergonomic though?
  • Annotate the panics for installing tokio executors outside of a tokio context, as well as point to the spawn_blocking_with_handle() for using something besides the current context

Re Execute trait object safety - it definitely is NOT object safe, since it has type parameters to tie the input function bounds to the output result.

Instead, we are using the Executor enum, with the custom closure executor as an object-safe escape hatch for allowing arbitrary execution functionality. That is the thing that we would expose most likely for allowing caller-specified strategies per execute.

@jlizen jlizen requested a review from rcoh January 2, 2025 18:23
@jlizen
Copy link
Owner Author

jlizen commented Jan 2, 2025

Issue for allowing usage of executors outside of global strategy: #10

@jlizen jlizen merged commit b89fd29 into main Jan 2, 2025
jlizen added a commit that referenced this pull request Jan 13, 2025
Refactor: Switch to accepting sync input closures, add builder, adjust defaults, add docs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature: Add async execute_sync() api to handle sync input functions
2 participants