Description
Disclaimer: This is not attempting to solve the general fe_setenv issue and allow changing the floating-point environment for floating-point code.
Based on rust-lang/rust#72252, it seems the following code is currently UB:
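pub unsafe fn div_3_1() -> f32 {
    use core::arch::x86_64::*;
    let x = _mm_set_ss(1.0);
    let y = _mm_set_ss(3.0);
    // Switch MXCSR to round-toward-zero for the one division below.
    _MM_SET_ROUNDING_MODE(_MM_ROUND_TOWARD_ZERO);
    let z = _mm_div_ss(x, y);
    // Restore the default rounding mode before any other float code runs.
    _MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);
    let mut k = 0.0f32;
    _mm_store_ss(&mut k, z);
    k
}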
Likewise, the following code is also considered UB:
pub unsafe fn div_3_1() -> f32 {
    use core::arch::asm;
    use core::arch::x86_64::*;
    let x = _mm_set_ss(1.0);
    let y = _mm_set_ss(3.0);
    _MM_SET_ROUNDING_MODE(_MM_ROUND_TOWARD_ZERO);
    let z;
    // The division is issued through inline assembly instead of an intrinsic.
    asm!("divss {}, {}", inlateout(xmm_reg) x => z, in(xmm_reg) y);
    _MM_SET_ROUNDING_MODE(_MM_ROUND_NEAREST);
    let mut k = 0.0f32;
    _mm_store_ss(&mut k, z);
    k
}
Both of these are surprising, as no rust-level floating-point operations are performed that would be affected by the rounding mode - only platform intrinsics or inline assembly.
These are limited examples, but a much more general example of this is a library that implements floating-point operations (including those in non-default floating-point environments from, e.g. C or C++) using a combination of software emulation, inline assembly, and platform intrinsics.
Assuming the LLVM issue mentioned in 72252 is fixed (and llvm's code generation for the llvm.x86.sse.div.ss intrinsic is fixed), can we call these examples defined behaviour, or is this code simply UB for some other reason?
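For comparison, here is a minimal sketch of the "software emulation" route mentioned above: the same round-toward-zero quotient obtained without ever touching MXCSR. The function name div_toward_zero is hypothetical, and the sketch assumes finite inputs and ignores overflow and underflow edge cases. It divides in the default round-to-nearest mode, uses a fused multiply-add to find the sign of the residual, and steps the result one ulp toward zero when the rounded quotient overshoots in magnitude.

```rust
/// Hypothetical helper: f32 division rounded toward zero, computed entirely
/// under the default round-to-nearest environment (MXCSR is never modified).
/// Sketch only: assumes finite inputs; overflow and underflow edge cases
/// are not handled.
pub fn div_toward_zero(x: f32, y: f32) -> f32 {
    // Round-to-nearest quotient; it is off by at most one ulp from the
    // round-toward-zero result.
    let q = x / y;
    if !q.is_finite() || q == 0.0 {
        return q;
    }
    // residual = x - q*y, computed with a single rounding by the fused
    // multiply-add, so its sign can be trusted.
    let residual = (-q).mul_add(y, x);
    // q*y has the same sign as x, so a residual whose sign differs from x
    // means |q| is larger than the exact |x / y|.
    let overshoots = residual != 0.0 && residual.is_sign_negative() != x.is_sign_negative();
    if overshoots {
        // Decrementing the bit pattern of a nonzero finite float moves it
        // one ulp toward zero, whatever its sign.
        f32::from_bits(q.to_bits() - 1)
    } else {
        q
    }
}
```

For 1.0 / 3.0 the default division yields the bit pattern 0x3EAAAAAB, while this helper yields 0x3EAAAAAA, which is what the intrinsic version is trying to obtain by changing the rounding mode.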
Activity
chorman0773 commented on Oct 19, 2023
@rustbot label +A-floats
RalfJung commented on Oct 19, 2023
You can't know that no Rust-level floating-point operations are affected. The compiler is allowed to move floating-point operations from elsewhere to in between the SET_ROUNDING_MODE calls. This is not a bug, floating-point operations are pure operations in the Rust AM and can be reordered arbitrarily. rustc has to thus implement them in a way that their behavior does not depend on any environmental flags, and it does that by having "the rounding mode is round-to-nearest" in its representation invariant that relates the low-level target state to the high-level AM state.
So yes, this is UB.
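A minimal sketch of what the round-to-nearest invariant buys the compiler, using hypothetical names: the same division may be evaluated at compile time or at run time, and under the default environment the two always agree. If the rounding mode were changed at run time, the result would depend on whether the optimizer happened to fold or move the operation.

```rust
/// Evaluated by the compiler, which always rounds to nearest.
pub const THIRD_FOLDED: f32 = 1.0 / 3.0;

/// Evaluated at run time, unless the optimizer decides to fold it as well.
/// The result only matches THIRD_FOLDED because the FP environment is
/// assumed to be in its default state.
pub fn third_at_runtime(x: f32, y: f32) -> f32 {
    x / y
}
```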
chorman0773 commented on Oct 19, 2023
My next question is whether it should be UB.
If even being in (any) rust code with an incorrect floating-point environment is UB, then that implicates quite a few things:
RalfJung commented on Oct 19, 2023
FWIW C code compiled with LLVM (and likely most other compilers) has the same UB. This is not just a Rust problem. I don't know anything about any of your examples, but if a platform ABI requires FP env mutation (as you claim for point 1) then that's already a highly problematic ABI at best. Which ABI requires FP env mutation? Therefore I also don't buy your second example; if that runtime support is implemented in C, then it already implicitly assumes a default FP environment. The C ABI (on pretty much any target) requires the FP env to be in default state.
Allowing modification of the FP environment without compromising optimizations on the 99.9% of functions that are happy with the default FP settings is not easy. We'd have to add some notion of "scope within which the compiler does not assume that the FP environment is in the default state". This would have to interact properly with inlining and things like that. It's not impossible, but it requires a bunch of design work.
chorman0773 commented on Oct 19, 2023
clang supports the STDC FENV_ACCESS pragma per requirements of ISO 9899. It uses constrained floating-point intrinsics. gcc also supports this pragma (again, as required by the standard). It is also not considered undefined behaviour by C to modify the floating-point environment w/o the pragma, though floating-point operations issued in a non-default FP Env may yield incorrect results according to the current FP Env - e.g. by constant-folding ops under the default rounding mode (whether it does is unspecified).
Any C compiler that considers it UB to do so is not compliant with the standard and I am under no obligation to write code that supports it, nor to have my own implementation follow such broken compilers. This is also true of C++, though C++ does not require support of the pragma (all C++ compilers I'm aware of, even MSVC, support it, though).
Moreso where the FP env is stored on the ABI, so I cannot emulate the fp env in the library. Practically every ABI I'm aware of specifies the platform effects of the fesetenv and fegetenv C functions.
The main example is libsfp, a runtime support library used by lccc to implement floating-point operations (of various sizes, among others), including those in a function marked #[fpenv(dynamic)]. To comply with ISO 9899, it must respect the floating-point environment in such functions, and must not cause UB period.
This is not the case of x86_64 Sys-V or MSABI. Both require that the floating-point environment is initialized to default (and specify that default), and then mark the control bits of the relevant registers (mxcsr and the x87 fcw) as callee saved, with the exception of functions that intentionally modify the floating-point environment (fesetenv is one). This is to support the aforementioned function, which is required by ISO 9899. Any C ABI that makes fesetenv immediate undefined behaviour is not a compliant ABI.
chorman0773 commented on Oct 19, 2023
If this was the case, then being in an async-signal-handler is immediate UB (as I noted is the case for Rust), which is most definitely not the case.
RalfJung commented on Oct 19, 2023
As far as I know, C/C++ code compiled with clang without special markers behaves exactly like Rust code wrt float operations, and hence has the same UB. I assume basically all C/C++ compilers will treat float operations as pure (unless some special marker is set to indicate that a scope has a non-default FP env) and hence move them around (including out of loops and out of potentially dead code). This makes mutating the FP environment UB in the real C that compilers implement, whether or not the standard agrees.
I have no interest in specifying Rust in a way that is disconnected from reality, so we should call this UB as that's what it is. Maybe it's UB because compilers are not compliant, but how's that helpful? It isn't, unless you have a proposal for how the standard can be implemented in a reasonable way.
That can't be true. When I write an extern "C" function, I must be able to rely on the FP env being in default state. If that wasn't the case then every single externally callable function that might use FP operations had to start by setting the rounding mode. If I set the FP env to a non-default state and then call some library and it misbehaves, I don't get to complain. There is no general expectation that libraries are resilient against non-default FP envs. (If any of this is wrong please let me know, I certainly haven't seen any evidence to the contrary.)
All of this shows that the rounding mode is part of the de-facto ABI. If the documentation disagrees then the documentation doesn't reflect the contract used in real-world software.
If the async signal handler uses any float operation, then it's most definitely UB. There's also nothing in the LLVM LangRef that would forbid the compiler from introducing a new float operation into code that doesn't use a float operation. This can be as subtle as
and hoisting the operation out of the if (which is obviously legal for a pure operation).
Sounds like async signal handlers need to be compiled with such a "FP env might be in non-default state" kind of a scope, otherwise there's no way they can be sound.
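A minimal sketch of the kind of hoisting described here. The function names are hypothetical, and this is not the snippet from the original comment. The first function is the shape a programmer might write; the second is a shape the optimizer is allowed to turn it into, because the division is a pure operation that may be speculated. A handler calling the first form with cond == false could therefore still execute a float instruction.

```rust
/// Source-level shape: the division appears only on the taken branch.
pub fn quotient_or_zero(cond: bool, x: f32, y: f32) -> f32 {
    if cond {
        x / y
    } else {
        0.0
    }
}

/// A shape the optimizer may produce: the division is hoisted out of the
/// `if` and executed unconditionally, then the result is selected.
pub fn quotient_or_zero_hoisted(cond: bool, x: f32, y: f32) -> f32 {
    let q = x / y; // runs even when `cond` is false
    if cond {
        q
    } else {
        0.0
    }
}
```

Under the default environment the two are indistinguishable, which is exactly why the transformation is legal.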
Also sounds like nobody really thought this entire FP env thing through to the end and different parts of the ecosystem made mutually incompatible choices, and now it's all busted. 🤷 It doesn't get better by pretending that it's not UB, though.
chorman0773 commented on Oct 19, 2023
As far as I am aware gcc (and msvc) implement the behaviour as prescribed. If clang does not, I would consider that a bug in clang and certainly not any behaviour I would desire to emulate in lccc.
This is either an extra constraint imposed by rust that does not reflect any actual C abi, or is an incorrect reliance. As far as I am aware, no C abi is non-compliant with the relevant sections of ISO 9899 in this regard. Knowing the precise behaviour of the Clever-ISA abi, I can quote the relevant text, though x86_64 Sys-V is similar (albeit less formal).
chorman0773 commented on Oct 19, 2023
(if you'd like, I can find the relevant portions of the x86_64 sys-v spec and msvc abi)
RalfJung commented on Oct 20, 2023
GCC says "Without any explicit options, GCC assumes round to nearest or even". It's unclear what that means, but it's far from obvious that it means "GCC guarantees that the code will work correctly under all FP environments".
LLVM is very clear: "The default LLVM floating-point environment assumes that traps are disabled and status flags are not observable. Therefore, floating-point math operations do not have side effects and may be speculated freely. Results assume the round-to-nearest rounding mode, and subnormals are assumed to be preserved." You might want to bring this up with the LLVM people if you think that's an issue.
Usually @comex is very good at getting compilers to apply the right optimizations in the right order to demonstrate an end-to-end miscompilation, maybe they can do it here, too? :)
What is this claim based on? Is it some standard that says so, or are there really targets and OSes where the kernel doesn't save and restore the FP environment when switching from a thread to its signal handler and back?
Do you have any evidence that every single library with a C ABI is actually expected to be working correctly under arbitrary FP environments, and that library authors consider it a bug when their library misbehaves under non-default FP environments?
As I said before, I care not only about what it says in some piece of paper but also about what is actually done in the real world. When standard and reality disagree, it's not automatically reality that's wrong. Sometimes the standard is just making completely unrealistic prescriptions that everybody ignores, and the standard should be fixed.
It's also unclear to me which alternative you are suggesting. Could you make a constructive proposal? Here are some options, and you can already immediately see why many people won't like them:
You are asking everyone to pay for a feature that hardly anyone needs. Is that your position, or do you see a better way out here?
If you further want to claim that even other aspects besides the rounding mode may be changed, such as making sNaNs trigger a trap, then either passing an sNaN to an FP operation is UB, or FP operations cannot be reordered at all with anything any more (e.g., reordering an FP operation and a store becomes illegal since the trap makes it fully observable when exactly an operation happens).
Muon commented on Oct 20, 2023
Linux definitely restores the FP environment when it enters a signal handler. There was a big kerfuffle about it back in 2002 when SSE2 arrived (https://yarchive.net/comp/linux/fp_state_save.html). I think FreeBSD might be a target that actually does not restore the FP environment when entering a signal handler, but I am unsure (https://reviews.freebsd.org/D33599). In any case, glibc says that fesetenv is async-signal-safe (https://www.gnu.org/software/libc/manual/html_node/Control-Functions.html), so fixing this shouldn't be a problem.
The SYSV ABI (https://gitlab.com/x86-psABIs/x86-64-ABI) stipulates that the FP control bits are callee-saved, meaning that the callee needs to restore them if it changes them. (Presumably an exception is intended for fesetenv and fesetround, but it seems to have been forgotten.) This doesn't mean that publicly-accessible library functions using FP instructions have to be built defensively to be correct, just that they have an undocumented assumption (that the FP environment is default).
The main consideration for Rust is that it is ultimately bound by LLVM's quirks. LLVM has made (is still making?) progress towards letting Clang support #pragma STDC FENV_ACCESS ON, so something similar would be good to implement in Rust eventually. My preference would be an attribute applicable to blocks and functions that describes how floating-point arithmetic behaves within them, similar to #pragma STDC FENV_ROUND direction.
Additionally, the C23 standard (and possibly earlier revisions) specifies in Section 7.6.1 "The FENV_ACCESS pragma" that it is UB to, under a non-default floating-point environment, execute any code that was compiled with the pragma set to off.
RalfJung commented on Oct 20, 2023
@Muon thanks! So looks like in practice, signal handlers are fine on our tier 1 targets, but other targets are having issues. (Also I heard that some versions of WSL do not save and restore the FP env for signal handlers.)
For this to be useful for compilers, the exception needs to be compiler-readable. Connor quoted above some wording saying that if the function is "documented" to change the FP state then it may do so, but of course that's not very useful.
This is similar to how setjmp needs an attribute so that the compiler can understand that something very weird is going on.
(Though floats are different in that as far as I can see, even with such an attribute there'd be a global cost.)
For Rust (and code compiled with clang) this means all functions have such an undocumented assumption.
RalfJung commented on Oct 20, 2023
FWIW in my opinion this is a case of bad ISA design. ISAs chose to introduce some global mutable state and as usual, global mutable state is causing problems. ISAs should provide opcodes that entirely ignore the FP status register so that languages can implement the desired semantics (floating-point operations that do not depend on global mutable state) properly. But it seems like even RISCV repeats this mistake, so we'll be stuck with hacks and quirks for many decades to come. Languages can choose to either make those ISA features basically inaccessible, to penalize all users for the benefit of the tiny fraction that actually wants a non-default FP status register, or to introduce syntactic quirks that mark where in the code floating-point opcodes behave in non-default ways.
chorman0773 commented on Oct 23, 2023
In my case, at least, there aren't any floating-point operations in sight (beyond stuff in inline-assembly). Inlining would be a thing, but this code is on the other side of a staticlib/dylib and LTO is off (not that the calls that care about fp-env could possibly LTO with llvm-compiled code anyways - this is being called by lccc's codegen). I'd prefer a more well-defined solution, though this is probably good until said solution exists.
Muon commented on Oct 23, 2023
That's delightful. I am surprised to learn that LLVM does not perform any range tracking on floating-point variables. Though I suppose if it did optimize things like that more aggressively perhaps that would expose too many bugs with its x87 handling.
RalfJung commented on Oct 23, 2023
That's amazing, thanks a ton. :) If I truly were your boss, you'd get a promotion. :D
It would still index the wrong element though? So one could then unsafely assert that we saw the right element and we would reach an unreachable_unchecked that should be unreachable, and that'd still be a miscompilation?
EDIT: Ah no it would of course index the right element, since it'd do the computation with default rounding mode. Yeah that is quite tricky, amazing that you found an example!
HadrienG2 commented on Dec 12, 2024
Assuming our beloved compiler backends can fix their broken semantics to allow it, I would argue that it makes a lot of sense for Rust to provide opt-in support for FTZ/DAZ mode (ideally in selected code regions so the rest of the code is not penalized) because...
HadrienG2 commented on Dec 13, 2024
I think what I'd love to have is something like this:
But if that's too difficult to implement, I can totally live with a function-scoped attribute (#[flush_denormals] fn gotta_go_fast() {}).
A global FTZ/DAZ compiler option would be more problematic on the other hand because some numerical algorithms do depend on proper denormals behavior for correctness. Think about e.g. iterative algorithms that run until estimated error gets below a certain threshold: in this case the error estimate computation can easily end up relying on Sterbenz's lemma for correctness, as nicely highlighted by this amazing bug report.
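To make the Sterbenz concern concrete, here is a minimal software model of FTZ/DAZ behaviour. It is a sketch with hypothetical helper names, not a proposal for how such a mode would be implemented. With gradual underflow, x - y is nonzero whenever x != y; once subnormal results are flushed, two distinct values can have a difference of exactly zero, which can make an error estimate look converged too early.

```rust
/// Hypothetical model of flushing: subnormal values become a zero of the
/// same sign, everything else passes through unchanged.
pub fn flush_to_zero(x: f32) -> f32 {
    if x.is_subnormal() {
        0.0_f32.copysign(x)
    } else {
        x
    }
}

/// Subtraction as it would behave with both DAZ (inputs flushed) and
/// FTZ (result flushed) enabled, modelled in software.
pub fn sub_ftz(a: f32, b: f32) -> f32 {
    flush_to_zero(flush_to_zero(a) - flush_to_zero(b))
}
```

For example, with x = 1.5 * f32::MIN_POSITIVE and y = f32::MIN_POSITIVE, the ordinary difference x - y is the subnormal f32::MIN_POSITIVE / 2.0, whereas sub_ftz(x, y) is 0.0 even though x != y.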
RalfJung commented on Dec 13, 2024
There's quite a big design space here, e.g. one could also imagine specifying the rounding mode and other aspects like denormal handling for each operation. That'd make a lot more sense semantically, and at least some ISAs (RISC-V) I hear are designed in a reasonable (non-stateful) way and support setting such flags on each instruction.
So, this will require someone or a small group of people proposing a t-lang project and working out some reasonable solutions here. It might require work on the LLVM side, too. t-opsem / UCG can help figure out the spec for concrete proposals, but we don't have the capacity to push for entirely new language extensions like this ourselves.
I don't think this issue is the right place to discuss the solution space here. I think the original question has been answered (yes, this is UB). The thing that's left before closing the issue is making sure this is properly documented. I am not entirely sure where such docs would go though... somewhere in the reference where we explain the assumptions Rust makes about the surrounding execution environment, but I don't think we have such a place yet?
HadrienG2 commented on Dec 14, 2024
Thanks for the feedback anyway. I must admit that I'm a bit lost in the communication channels that the Rust project uses. What do you think is the best place to bring this discussion to see if there are enough other interested people? t-lang at rust-lang zulip? internals.rust-lang.org? Somewhere else?
RalfJung commented on Dec 14, 2024
I'd start by writing up some pre-RFC draft and circulating it on Zulip and/or IRLO.