Description
This was found by @RalfJung in #45955.
alloc_system::System::alloc calls libc::malloc, which is defined as:
extern {
    pub fn malloc(size: size_t) -> *mut c_void;
}
When #[global_allocator] is not used, the current default for executables is alloc_jemalloc, which links jemalloc as configured in src/liballoc_jemalloc/build.rs. On most platforms, this is without --with-jemalloc-prefix, which causes jemalloc to define an unprefixed malloc symbol that "overrides" libc’s and ends up being used by alloc_system.
So in this situation alloc_system doesn’t do what the name suggests.
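To illustrate why this happens, here is a minimal sketch (not the actual alloc_system source) of a Rust program that declares malloc and free by name only. The helper roundtrip_through_malloc is hypothetical. The declaration names a symbol, not a library, so whichever definition of malloc the linker picks (libc's, or an unprefixed jemalloc's) is the one that runs. The `unsafe extern` syntax assumes Rust 1.82 or later; on older toolchains, drop the leading `unsafe`.

```rust
use std::os::raw::c_void;

// Only the symbol names are declared here. Nothing says "libc": the linker
// decides which definition of `malloc` and `free` these calls bind to.
unsafe extern "C" {
    fn malloc(size: usize) -> *mut c_void;
    fn free(ptr: *mut c_void);
}

/// Hypothetical helper: allocates `len` bytes through the resolved `malloc`,
/// fills them with `byte`, sums them, and frees the block again.
fn roundtrip_through_malloc(len: usize, byte: u8) -> Option<usize> {
    unsafe {
        let ptr = malloc(len) as *mut u8;
        if ptr.is_null() {
            return None;
        }
        std::ptr::write_bytes(ptr, byte, len);
        let sum: usize = std::slice::from_raw_parts(ptr, len)
            .iter()
            .map(|&b| b as usize)
            .sum();
        free(ptr as *mut c_void);
        Some(sum)
    }
}

fn main() {
    // The result is the same whichever allocator backs the symbol;
    // the source code cannot tell the difference.
    println!("{:?}", roundtrip_through_malloc(16, 2));
}
```

Nothing in this program changes if jemalloc is linked without a symbol prefix, which is why the behaviour described above is invisible at the source level.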
Like https://github.com/alexcrichton/jemallocator/issues/19, this problem will disappear when alloc_jemalloc is eventually removed, but in the meantime alloc_system doesn’t always do what the name says it does.
We stopped prefixing jemalloc symbols in #31460 in order to make LLVM use them. Could we perhaps only do this when compiling a compiler? Or perhaps the compiler could switch to using the jemallocator crate, which would gain a Cargo feature flag to disable prefixing? It would do something like that anyway to keep that LLVM+jemalloc benefit when alloc_jemalloc is removed and the default for executables is changed to alloc_system.
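For readers unfamiliar with #[global_allocator], the following is a minimal sketch of opting in to the system allocator explicitly. It assumes the later stabilized std::alloc API (Rust 1.28 and newer) rather than the std::heap API discussed in this thread, and the CountingSystem wrapper and allocation_count helper are hypothetical names added only to make the effect observable. Note that System still reaches the allocator through the malloc symbol, so the issue described above applies to it as well.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper that forwards every request to `System`
/// and counts how many allocations were made.
struct CountingSystem;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingSystem {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Selects the allocator for the whole program.
#[global_allocator]
static GLOBAL: CountingSystem = CountingSystem;

/// Hypothetical helper: number of allocations seen so far.
fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}

fn main() {
    let before = allocation_count();
    // black_box keeps the optimizer from eliding the allocation.
    let v: Vec<u8> = std::hint::black_box(Vec::with_capacity(64));
    assert!(allocation_count() > before);
    drop(v);
}
```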
Activity
SimonSapin commented on Nov 20, 2017
@alexcrichton wrote #46117 (comment)
Do you mean impossible while also making LLVM use jemalloc? If we revert (relevant parts of) #31460 and go back to always compiling jemalloc with --with-prefix=something, wouldn’t that fix this issue? What’s the plan for LLVM once the alloc_jemalloc crate is removed?
sfackler commented on Nov 20, 2017
Prefixing jemalloc symbols would resolve this issue, but I would argue that would be a net negative. It's not great to have two competing allocators both running at the same time in a single process.
As long as alloc_jemalloc/jemallocator/whatever is configured to have non-prefixed symbols, LLVM will continue to use it AFAIK.
alexcrichton commented on Nov 21, 2017
@SimonSapin ah yeah sorry, what I meant was we could indeed prefix the symbols, but the intention is to get LLVM to use jemalloc (as it makes it ~10% faster historically). I think our long term plan is to remove jemalloc from libstd but leave it in the compiler, so rustc itself will still use jemalloc but Rust programs by default will not.
sfackler commented on Nov 21, 2017
If you really want to talk to specifically glibc malloc, you can link to __libc_malloc, but that's probably not a portable thing.
SimonSapin commented on Nov 21, 2017
@alexcrichton Right, I think that split between rustc and std is probably the best eventual outcome.
@sfackler Interesting. I think we still want alloc_system to call plain malloc, though. For example Firefox redefines malloc (in a fork of an old version of jemalloc) and expects Rust dynamic libraries to use it.
RalfJung commented on Nov 21, 2017
I can see that point. But then we should not be providing APIs that pretend to do this, while they actually do not.
SimonSapin commented on Nov 21, 2017
TL;DR: let’s switch rustc to unprefixed jemallocator and restore symbol prefixes in alloc_jemalloc now?
So, we’re discussing a number of desirable but apparently competing points. I think we can have our cake and eat it too. I’m gonna name them to untangle everything without repeating lengthy phrases over and over.
A. std::heap::Heap in Rust executables use jemalloc
B. std::heap::Heap in Rust executables use the system allocator
[…] malloc linked with a Rust program end up using the same allocator as std::heap::Heap
F. std::heap::System uses the system allocator even if not selected for std::heap::Heap with #[global_allocator]: fixing this bug
Currently we have D and E on some platforms since alloc_jemalloc configures jemalloc without a symbol prefix, and A since it’s the default.
There’s a number of changes we can make. These are not blocked as far as I can tell:
1. […] alloc_jemalloc always configure jemalloc with a symbol prefix.
2. […] alloc_jemalloc and make B the default
3. […] unprefixed Cargo feature to https://crates.io/crates/jemalloc-sys (and perhaps forward it through https://crates.io/crates/jemallocator) so that it configures jemalloc without a symbol prefix. (The default would still be --with-prefix=_rjem_.)
4. […] jemallocator with 3 instead of alloc_jemalloc
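The proposed unprefixed Cargo feature for jemalloc-sys could look roughly like the build-script sketch below. This is an assumption-laden illustration, not the crate's actual build.rs: the helpers jemalloc_configure_args and malloc_symbol are hypothetical, and the flag is spelled --with-jemalloc-prefix as in the issue description. The only mechanism relied on is that Cargo sets a CARGO_FEATURE_<NAME> environment variable for each enabled feature when it runs a build script.

```rust
use std::env;

/// Hypothetical helper: arguments passed to jemalloc's `configure`.
/// The prefix is kept by default and dropped only when the
/// (proposed) `unprefixed` feature is enabled.
fn jemalloc_configure_args(unprefixed: bool) -> Vec<String> {
    let mut args = Vec::new();
    if !unprefixed {
        args.push("--with-jemalloc-prefix=_rjem_".to_string());
    }
    args
}

/// Hypothetical helper: the symbol the Rust bindings would then link against.
/// With a prefix, jemalloc exports `_rjem_malloc` and leaves `malloc` to libc;
/// without one, it exports `malloc` itself.
fn malloc_symbol(unprefixed: bool) -> &'static str {
    if unprefixed {
        "malloc"
    } else {
        "_rjem_malloc"
    }
}

fn main() {
    // Cargo exposes enabled features to build scripts as environment variables.
    let unprefixed = env::var_os("CARGO_FEATURE_UNPREFIXED").is_some();
    println!("configure args: {:?}", jemalloc_configure_args(unprefixed));
    println!("allocation entry point: {}", malloc_symbol(unprefixed));
}
```

Keeping the prefix as the default means ordinary users of the crate would not shadow libc's malloc, while rustc could enable the feature so that LLVM keeps picking up jemalloc.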
This is blocked on API design decisions:
5. […] #[global_allocator] (Tracking issue for changing the global, default allocator (RFC 1974) #27389) and Alloc (Allocator traits and std::heap #32838).
With 1 alone we gain F, but lose D and E.
With 2 alone we gain B and F, but lose A, D, and E. Which of A or B is more desirable (if we have to choose) is debatable. Adding 5 restores A.
#33082 (comment) suggests that doing 2 is planned, but not until 5 is solved (presumably to avoid losing A). At that point we’ll likely also want to avoid losing D (or E). Doing 3 and 4 seems to me to be the easiest way to achieve that.
However we don’t need to wait for 5 to be stabilized before doing 3 or 4. So I suggest doing 3, 4, then 1 now-ish. Both in order to fix F, and to be ready to do 2 later when 5 is unblocked. The only point we would temporarily lose is E for stable users.
@alexcrichton, @sfackler, what do you think?
alexcrichton commented on Nov 21, 2017
Seems plausible to me!
I'm remembering now though that this probably won't work unfortunately. The standard library is created as a dynamic library which currently fixes the allocator to jemalloc (alloc_jemalloc that is). That will be required to get fixed first before we can have alloc_jemalloc and jemallocator
SimonSapin commented on Nov 21, 2017
Any mention of "use jemallocator" above implies selecting it with #[global_allocator]. My understanding is that alloc_jemalloc and its copy of jemalloc are only linked when #[global_allocator] is not used. Otherwise jemallocator could never be used at all.
alexcrichton commented on Nov 21, 2017
Er yes I think I understand what you're advocating for, and it sounds like a great plan. What I mean is that if you do it you'll get a compile time error and it will fail to compile.
SimonSapin commented on Nov 21, 2017
I don’t understand what error that would be.