some "Unix" targets don't support most Unix idioms #141838

Open
workingjubilee opened this issue May 31, 2025 · 8 comments
Labels
A-cfg Area: `cfg` conditional compilation C-bug Category: This is a bug. O-unix Operating system: Unix-like T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@workingjubilee
Member

This is a persistent issue with target_family = "unix" being defined by targets that happen to have a somewhat Unix-like compiling-for-them experience but do not have a Unix runtime. This causes considerable amounts of working around them in configuration code for libraries, since you cannot simply treat them as Unix.

Some examples:

@workingjubilee workingjubilee added the C-bug Category: This is a bug. label May 31, 2025
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label May 31, 2025
@ivmarkov
Contributor

ivmarkov commented Jun 1, 2025

> This causes considerable amounts of working around them in configuration code for libraries, since you cannot simply treat them as Unix.

While removing cfg(unix) for these targets would indeed automatically opt out the "CtrlCs" of this world, and likewise processes and Unix domain sockets, you would now have to put in more work to explicitly opt the same or different crates into knowing that these targets, which are !cfg(unix), actually do support BSD sockets and other Unix-y stuff (note that once these crates take the cfg(unix) route, there are no additional per-OS branches inside, including for "ESPIDF"):

cfg(unix) is just too coarse-grained :(

With that said, if cfg(unix) is indeed considered to mean a multi-process POSIX OS (or suchlike), then !cfg(unix) might indeed be the way to go. I'm just not sold on the argument that we should do this because it will result in fewer workarounds.

@workingjubilee
Member Author

Mostly what I'm noticing, @ivmarkov, is that even if we e.g. address various stdlib APIs with a solution like cfg(target_has_processes) or cfg(target_has_threads), we don't touch on artifacts that Rust generally doesn't consider part of itself, but are still core to the platform and of interest to developers, like signals.

@workingjubilee
Member Author

And I certainly must express at least some doubt that we want to touch signals in the stdlib.

@ivmarkov
Contributor

ivmarkov commented Jun 1, 2025

> Mostly what I'm noticing, @ivmarkov, is that even if we e.g. address various stdlib APIs with a solution like cfg(target_has_processes) or cfg(target_has_threads), we don't touch on artifacts that Rust generally doesn't consider part of itself, but are still core to the platform and of interest to developers, like signals.

cfg(unix) already means two things: (a) a proclamation for the user that the OS is unix-like (b) a way to do code-branching in stdlib itself.

I'm not following why we can't have cfg(target_has_signals) even if just for the purpose of (a) and not for (b)?

@workingjubilee
Member Author

workingjubilee commented Jun 1, 2025

I suppose we could, but that raises the question: how many such cfg do we add that ESP-IDF would fail before we start wishing to have answered differently on cfg(unix)?

@Noratrieb Noratrieb added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. O-unix Operating system: Unix-like A-cfg Area: `cfg` conditional compilation and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Jun 1, 2025
@workingjubilee
Member Author

@ivmarkov To be clear, I'm not trying to inconvenience you specifically, I'm just thinking about how we can relatively-ergonomically support various use-cases without every Rust crate author having to hand-roll their own list of platform support.

@ivmarkov
Contributor

ivmarkov commented Jun 1, 2025

> @ivmarkov To be clear, I'm not trying to inconvenience you specifically, I'm just thinking about how we can relatively-ergonomically support various use-cases without every Rust crate author having to hand-roll their own list of platform support.

Sure. And cfg(unix) does (or should) indeed mean "you have signals" on all OSes except on the microcontroller ones.
For microcontrollers (ESP-IDF and Nuttx at least), it is just messy either way:

  • If you pull them out of cfg(unix), the case where signals have to be disabled for those is solved cleanly without additional cfg;
  • However - now - the case that these actually DO support BSD INET sockets (and maybe a bunch of other stuff) needs additional OR cfg(target_os = "thisorthat") conf next to the cfg(unix) one.

So in either case, for microcontrollers you do need additional confs. Because cfg(unix) is just too coarse grained.
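The two options above can be sketched with the `cfg!` macro, which evaluates a configuration predicate to a `bool` at compile time. This is only an illustration: the function names are hypothetical, and the exception list ("espidf", "nuttx") is just the two targets named in this comment, not an exhaustive one.

```rust
/// Status quo: the microcontroller targets are `cfg(unix)`, so code that
/// needs signals has to carve them out with an extra `not(...)`.
fn signals_status_quo() -> bool {
    cfg!(all(unix, not(any(target_os = "espidf", target_os = "nuttx"))))
}

/// Status quo: BSD sockets code can simply follow the `unix` path.
fn bsd_sockets_status_quo() -> bool {
    cfg!(unix)
}

/// If those targets were pulled out of `cfg(unix)`: signals need no extra cfg...
fn signals_if_removed() -> bool {
    cfg!(unix)
}

/// ...but BSD sockets now need the additional OR on `target_os`.
fn bsd_sockets_if_removed() -> bool {
    cfg!(any(unix, target_os = "espidf", target_os = "nuttx"))
}
```

Either way, exactly one of the two capabilities needs an extra `target_os` list; the choice only moves that list from the signals predicate to the sockets predicate.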

But also - either way is no big deal for us.

If T-libs decides to pull us out of cfg(unix) - say - because the status quo gives the wrong "impression" w.r.t. system caps to the users - we need to update all downstream unix-y crates known to work with ESP-IDF - as well as STD itself - with the above OR cfg(target_os = "thisorthat") conf so that they can still take their "unix" code path where BSD INET sockets live. Annoying but quite doable.

It is only that the issue was seemingly opened not with the "impression" argument in mind, but with the argument that "This causes considerable amounts of working around them in configuration code for libraries". What I'm saying is that whichever way you decide to treat microcontrollers, "some amount of working around them in configuration code" is just unavoidable. My gut feeling is still that with cfg(unix) "on" for microcontrollers, we have better chances for crates to typecheck with few or no changes and, yeah, eventually fail at runtime for - say - processes, which might actually be acceptable.
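A minimal sketch of what "typecheck, then fail at runtime" could look like from a crate's point of view. It assumes that a target without process support reports `std::io::ErrorKind::Unsupported` from `Command::spawn`; that behaviour is an assumption here and is not verified for ESP-IDF or NuttX. The helper name `try_spawn` is hypothetical.

```rust
use std::io;
use std::process::Command;

/// Try to run `program` and wait for it to exit.
/// Returns Ok(true) if it ran, Ok(false) if the platform reports that
/// spawning processes is unsupported, and Err for any other failure.
fn try_spawn(program: &str) -> io::Result<bool> {
    match Command::new(program).spawn() {
        Ok(mut child) => {
            // Wait so that the child does not linger as a zombie process.
            child.wait()?;
            Ok(true)
        }
        // Degrade gracefully where the target has no process support.
        Err(e) if e.kind() == io::ErrorKind::Unsupported => Ok(false),
        Err(e) => Err(e),
    }
}
```

The code compiles unchanged on every target that has `std::process`, and the caller decides whether `Ok(false)` is acceptable.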

===

BTW - and sorry if clear - but running most "off the shelf" STD crates from crates.io on microcontrollers without any modifications is a chimera anyway.

When you have a total of ~200KB for the .bss and heap of your app (the MCU's total RAM is ~400KB for the beefier ones), memory optimizations are almost certainly necessary. In the lucky cases, that means just reducing some buffer sizes; in the not so lucky ones, more obtrusive memory optimizations, or giving up on the crate altogether because it was not designed with memory-constrained environments in mind to begin with.

Popular crates which are thin, mostly zero-cost wrappers around the OS, like STD itself and the networking crates I enumerated above; async runtimes; utilities like url, time, chrono, random etc. - these do fit the bill and do work with no or relatively small modifications (which, again in my experience, are smaller with cfg(unix)!). But I don't think I can generalize this to ALL STD crates on crates.io, or even to the majority of them. Only very popular crates receive the amount of memory-banging needed to make them a good fit for microcontrollers.

Just mentioning because this whole "let's make sure downstream crates have an easier life by selecting the more correct cfg(unix) vs !cfg(unix) for microcontrollers" seems a bit like a storm in a teacup to me given that the crate has to be memory-optimized to begin with, which often induces changes to it anyway, or considerably reduces the crate selection options.

===

> I suppose we could, but that raises the question: how many such cfg do we add that ESP-IDF would fail before we start wishing to have answered differently on cfg(unix)?

I don't follow.
In a perfect world where we have the cfg(have_*) capability system, cfg(unix), cfg(windows) and cfg(target_os = "thisorthat") become obsolete for downstream crates. So whether ESP-IDF is cfg(unix) or not becomes an irrelevant question.

Alas, we are not there, hence the whole discussion, I guess.
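Until such a capability system exists, a single crate can approximate it locally from its build script. The sketch below is hypothetical: the cfg names (`has_bsd_sockets`, `has_signals`, `has_processes`) are made up for illustration, and the mapping only encodes what this thread says about ESP-IDF, NuttX and Hermit, so it deliberately ignores other targets such as Windows. `CARGO_CFG_TARGET_FAMILY`, `CARGO_CFG_TARGET_OS`, `cargo:rustc-cfg` and `cargo:rustc-check-cfg` are the standard Cargo build-script mechanisms.

```rust
use std::env;

/// Map a target description to hypothetical capability cfg names.
/// `target_family` is a comma-separated list, as Cargo passes it.
fn capability_cfgs(target_family: &str, target_os: &str) -> Vec<&'static str> {
    let is_unix = target_family.split(',').any(|f| f == "unix");
    // Assumption from the thread: these two lack signals and processes.
    let is_mcu = matches!(target_os, "espidf" | "nuttx");
    let mut caps = Vec::new();
    if is_unix || target_os == "hermit" {
        caps.push("has_bsd_sockets");
    }
    if is_unix && !is_mcu {
        caps.push("has_signals");
        caps.push("has_processes");
    }
    caps
}

/// Intended to be called from `fn main()` in build.rs.
fn emit_capability_cfgs() {
    let family = env::var("CARGO_CFG_TARGET_FAMILY").unwrap_or_default();
    let os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    // Declare the names so the `unexpected_cfgs` lint knows about them.
    for name in ["has_bsd_sockets", "has_signals", "has_processes"] {
        println!("cargo:rustc-check-cfg=cfg({name})");
    }
    // Enable the capabilities that apply to the current target.
    for cap in capability_cfgs(&family, &os) {
        println!("cargo:rustc-cfg={cap}");
    }
}
```

The crate's own code can then use `#[cfg(has_signals)]` instead of spelling out `target_os` lists at each use site. The drawback raised in this thread remains: every crate would be maintaining its own copy of the mapping.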

@mkroening
Contributor

We are facing this issue too with Hermit. Hermit has traditionally had custom system calls, but we are now moving towards POSIX in many ways, though not in all. We are committing to POSIX networking and file descriptors, but explicitly don't support spawning processes or signals.

So when we are adding support for Hermit in popular crates such as Tokio, we mostly need to change cfg(unix) to cfg(any(unix, target_os = "hermit")). When considering upstreaming the Hermit support, we even considered if it was worthwhile making Hermit cfg(unix) and instead opting out of all the things that Hermit does not support, but that's probably not the right thing to do.

So having something more fine-grained like cfg(posix_net), cfg(posix_fd) would be great for us.
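For illustration, the `cfg(unix)` to `cfg(any(unix, target_os = "hermit"))` change described above typically looks like the following. The module and function names are hypothetical and do not come from Tokio or any other crate.

```rust
// Previously gated on `#[cfg(unix)]` alone; Hermit is added with `any(...)`.
#[cfg(any(unix, target_os = "hermit"))]
mod io_backend {
    /// Selected on targets that offer POSIX-style file descriptors.
    pub const NAME: &str = "posix";
}

// Every other target falls through to a different implementation.
#[cfg(not(any(unix, target_os = "hermit")))]
mod io_backend {
    pub const NAME: &str = "fallback";
}

/// Report which backend was compiled in for the current target.
pub fn backend_name() -> &'static str {
    io_backend::NAME
}
```

With a finer-grained predicate such as the proposed `cfg(posix_fd)`, both attributes would name the capability rather than a growing list of operating systems.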

5 participants