Skip to content

run appears to be stuck (due to buffer limit?) #44

@itingliu

Description

@itingliu
Contributor
let result = try await Subprocess.run(
    .path("/bin/cat"),
    arguments: ["path/to/some/large/text/file.txt]
)

This looks like it is stuck and never returns. It is because the file content exceeds the default limit, and cat is waiting to write more to the output buffer.

Activity

iCharlesHu

iCharlesHu commented on May 15, 2025

@iCharlesHu
Contributor

For this one we decided to just remove the default buffer size limit, but still allow you to set one if you choose to.

self-assigned this
on May 15, 2025
weissi

weissi commented on May 23, 2025

@weissi

That doesn't sound like the right fix. If something blows past the 'maximum bytes collect' limit, you should throw an error (potentially giving the user what we have so far).

Just raising the buffer won't help, try cat /dev/zero

iCharlesHu

iCharlesHu commented on Jun 10, 2025

@iCharlesHu
Contributor

Update: we decided to throw an error instead of raising the limit. This is the same default behavior with AsyncBufferSequence.LineSequence

iCharlesHu

iCharlesHu commented on Jul 3, 2025

@iCharlesHu
Contributor

Update: in addition to throwing an error we now also require bufferLength to be set by the developer.

LiarPrincess

LiarPrincess commented on Jul 9, 2025

@LiarPrincess

Things are a bit complicated here, and I think the weissi answer lacks some context. It may be because there was a behind the scenes conversation - not posted on the github.

The biggest question is: what do users expect?

git status

I will use git status for all of the examples; think of it as a Schrödinger process that can print a little, but it can also print a lot.

git status without limit

Users expect to get the full output of a command, regardless of the size, because they explicitly asked for “without limit”. While reading the output we have a potentially unbounded buffer - those are never comfortable, but there are some practical uses for “Just give me everything, I don't want to play the limit guessing game.”.

This is similar to how AsyncSequence has unbounded buffer by default - the stdlib just trusts that the users know what they are doing.

git status with 60kb limit

Most of the users expect that the subprocess reads the first 60kb and then throws an error if there is something more. Exactly when the error happens is an entirely different conversation.

There are alternatives to throwing an error.

git status

This is the situation mentioned in the issue, but if you read closely it is about the deadlock. Regardless of whether an error is thrown or not, I think that we can all agree that this should never deadlock.

That doesn't sound like the right fix. If something blows past the 'maximum bytes collect' limit, you should throw an error (potentially giving the user what we have so far).

I do not think that “throw or not throw” is the question here. I think it is about the default behavior:

  • no limit by default (unbounded buffer) - act exactly as git status without limit above
  • limit by default (bounded buffer) - act exactly as git status with 60kb limit above

We choose the default value for the argument, and the throw (or no throw) follows.

Questions

This whole thing creates a flowchart of questions that need to be answered. There may be some other questions here, but I'm not getting paid for this anyway.

1. Should the default value for stdin/stderr be discard or collect?

“Discard” is nice because it does “nothing” by default. If you want something to happen then you have to explicitly specify it by providing the value for the standardOutput argument. This is good for readability and code reviews.

OTOH, “collect” is a good option as this is what users expect. The big problem is that some people may not know that this is the default and end up with tons of memory consumed for something that they do not need.

2. If “collect” is the default: should the buffer be unbounded or bounded+throw?

Unbounded may consume tons of the memory (possible halt/crash), but the same is true for AsyncSequence default.

Bounded+throw is definitely a safer option, but most of the people do not expect git status (or cat big_file.txt) to throw. You need to read the documentation carefully, and notice that this argument has a limit in the default value. “RTFM” exists because people are not good at reading documentation.

If we use CI/CD as an example: as our code base grows so does the output of the swift build, and at some point it may exhaust the limit, which will will throw. Will people be surprised about this?

3. When we reach he limit do we throw or return normally?

We need the bounded mode anyway (regardless of whether it is the default), so the questions 3 and 4 are for this scenario.

Alternative to throwing the error:

  1. Read the first 60kb
  2. Discard the rest
  3. Return normally, but notify the user that the output is truncated.

Throwing an error is a “more expected” outcome.

4. When do we throw the limit error?

Let's assume that we are at the exact moment that the limit was exhausted.

Option A: Wait for the child process to finish and then throw the error. To archive this we need to continue reading (and discarding) the output (both stdout/stderr) to prevent deadlock with filled pipe, then throw once we confirm the termination. We may end up in a situation when the child process exits successfully, and yet we have to throw the error because of the limit. This error has to contain the exit status of the process etc.

Option B: Throw at the exact moment we reached the limit. What happens to the child process?

  • Do we kill it? This is problematic as the swiftlang/swift-subprocess library has no way of knowing if this is safe. Things may go very wrong if we kill the process in the middle of a distributed database update.
  • Let it continue? We throw an error, and the process continues running. Slightly weird.

Trivia

Just raising the buffer won't help, try cat /dev/zero

If you want an infinite stream you can just use yes instead of cat /dev/zero.

iCharlesHu

iCharlesHu commented on Jul 9, 2025

@iCharlesHu
Contributor
  1. Should the default value for stdin/stderr be discard or collect?

The current default values for stdin, stdout, and stderr are .none (no input), .string (collect as String), and .discarded. I believe this is the most common use case. Instead of setting the output to .string by default, we propose removing this default setting and .string itself. This means the caller will have to explicitly choose an output type and buffer size. We chose this approach because it aligns with Swift’s overall design principle: if we can’t provide a reasonable default (and in this case, we truly can’t), we leave it to the caller to decide. This way, you can pass .string(limit: .max) if you want the git status without limit behavior.

  1. If “collect” is the default: should the buffer be unbounded or bounded+throw?

This is a good question which is why we decided to not offer a default at all. You'll have to pick one according to your needs.

  1. When we reach the limit do we throw or return normally?

I think what you’re describing is a choice between a "soft" and a "hard" limit. We opted for a hard limit here because, for many collected output types, it’s not possible to arbitrarily limit the output and expect it to still be convertible to the target output type. For instance, converting a truncated buffer to a String might not always work due to incorrect grapheme cluster breaks. In my opinion, if you want a soft limit, you should use the custom closure run. This way, you can decide how to process potentially partial data and determine what to do (send a signal, etc.) when the limit is reached.

  1. When do we throw the limit error?

We decided on throwing the error at the exact moment the limit is reached since there's no point return partial data as discussed above. At which point we will attempt to gracefully terminate the child process by running teardownSequence.

Things may go very wrong if we kill the process in the middle of a distributed database update.

Hence we implemented graceful shutdown via teardown sequence. Any well written apps should respond gracefully to SIGTERM to finish up any pending writes and terminates itself gracefully to avoid data losses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Labels

blockerShow stopping issues for 0.0.1bugSomething isn't working

Type

No type

Projects

No projects

Relationships

None yet

    Development

    Participants

    @weissi@iCharlesHu@LiarPrincess@itingliu

    Issue actions

      `run` appears to be stuck (due to buffer limit?) · Issue #44 · swiftlang/swift-subprocess