Skip to content

liballoc_system without #[global_allocator] uses jemalloc #45966

Closed
@SimonSapin

Description

@SimonSapin
Contributor

This was found by @RalfJung in #45955.

alloc_system::System::alloc calls libc::malloc which is defined as:

extern {
    pub fn malloc(size: size_t) -> *mut c_void;
}

When #[global_allocator] is not used, the current default for executables is alloc_jemalloc, which links jemalloc as configured in src/liballoc_jemalloc/build.rs. On most platforms, this is without --with-jemalloc-prefix, which causes jemalloc to define an unprefixed malloc symbol that "overrides" libc’s and ends up being used by alloc_system.

So alloc_system doesn’t do what the name suggests, in this situation.

Like https://github.com/alexcrichton/jemallocator/issues/19 this problem will disappear when alloc_jemalloc is eventually removed, but in the meantime alloc_system doesn’t always do what the name says it does.

We stopped prefixing jemalloc symbols in #31460 in order to make LLVM use them. Could we perhaps only do this when compiling a compiler? Or perhaps the compiler could switch to using the jemallocator crate, which would gain a Cargo feature flag to disable prefixing? It would do something like that anyway to keep that LLVM+jemalloc benefit when alloc_jemalloc is removed and the default for executables is changed to alloc_system.

CC @alexcrichton

Activity

added
C-enhancementCategory: An issue proposing an enhancement or a PR with one.
on Nov 14, 2017
SimonSapin

SimonSapin commented on Nov 20, 2017

@SimonSapin
ContributorAuthor

@alexcrichton wrote #46117 (comment)

Unfortunately I think it's basically just impossible to fix System.alloc calling jemalloc because of linker trickery. Our only recourse is to remove jemalloc.

Do you mean impossible while also making LLVM use jemalloc? If we revert (relevant parts of) #31460 and go back to always compiling jemalloc with --with-prefix=something, wouldn’t that fix this issue? What’s the plan for LLVM once the alloc_jemalloc crate is removed?

sfackler

sfackler commented on Nov 20, 2017

@sfackler
Member

Prefixing jemalloc symbols would resolve this issue, but I would argue that would be a net negative. It's not great to have two competing allocators both running at the same time in a single process.

As long as alloc_jemalloc/jemallocator/whatever is configured to have non-prefixed symbols, LLVM will continue to use it AFAIK.

alexcrichton

alexcrichton commented on Nov 21, 2017

@alexcrichton
Member

@SimonSapin ah yeah sorry what I meant was we could indeed prefix the symbols but the intention is to get LLVM to use jemalloc (as it makes it ~10% faster historically). I think our long term plan is to remove jemalloc from libtsd but leave it in the compiler, so rustc itself will still use jemalloc but Rust programs by default will not.

sfackler

sfackler commented on Nov 21, 2017

@sfackler
Member

If you really want to talk to specifically glibc malloc, you can link to __libc_malloc, but that's probably not a portable thing.

SimonSapin

SimonSapin commented on Nov 21, 2017

@SimonSapin
ContributorAuthor

@alexcrichton Right, I think that split between rustc and std is probably the best eventual outcome.

@sfackler Interesting. I think we still want alloc_system to call plain malloc, though. For example Firefox redefines malloc (in a fork of an old version of jemalloc) and expects Rust dynamic libraries to use it.

RalfJung

RalfJung commented on Nov 21, 2017

@RalfJung
Member

It's not great to have two competing allocators both running at the same time in a single process.

I can see that point. But then we should not be providing APIs that pretend to do this, while they actually do not.

SimonSapin

SimonSapin commented on Nov 21, 2017

@SimonSapin
ContributorAuthor

TL;DR: let’s switch rustc to unprefixed jemallocator and restore symbol prefixes in alloc_jemalloc now?


So, we’re discussing a number of desirable but apparently competing points. I think we can have our cake and eat it too. I’m gonna name them to untangle everything without repeating lengthy phrases over and over.

  • A. Stable users can choose to have std::heap::Heap in Rust executables use jemalloc
  • B. Stable users can choose to have std::heap::Heap in Rust executables use the system allocator
  • D. Rustc and LLVM-in-rustc use jemalloc (for that 10% perf improvement)
  • E. C/C++ libraries that uses malloc linked with a Rust program end up using the same allocator as std::heap::Heap
  • F. std::heap::System uses the system allocator even if not selected for std::heap::Heap with #[global_allocator]: fixing this bug

Currently we have D and E on some platforms since alloc_jemalloc configures jemalloc without a symbol prefix, and A since it’s the default.

There’s a number of changes we can make. These are not blocked as far as I can tell:

  • 1. Make alloc_jemalloc always configure jemalloc with a symbol prefix.
  • 2. Remove alloc_jemalloc and make B the default
  • 3. Add an unprefixed Cargo feature to https://crates.io/crates/jemalloc-sys (and perhaps forward it through https://crates.io/crates/jemallocator) so that it configures jemalloc without a symbol prefix. (The default would still be --with-prefix=_rjem_.)
  • 4. Make rustc use jemallocator with 3 instead of alloc_jemalloc

This is blocked on API design decisions:

With 1 alone we gain F, but lose D and E.

With 2 alone we gain B and F, but lose A, D, and E. Which of A or B is more desirable (if we have to choose) is debatable. Adding 5 restores A.

#33082 (comment) suggests that doing 2 is planned, but not until until 5 is solved (presumably to avoid losing A). At that point we’ll likely also want to avoid losing D (or E). Doing 3 and 4 seems to be to be the easiest way to achieve that.

However we don’t need to wait for 5 to be stabilized before doing 3 or 4. So I suggest doing 3, 4, then 1 now-ish. Both in order to fix F, and to be ready to do 2 later when 5 is unblocked. The only point we would temporarily lose is E for stable users.

@alexcrichton, @sfackler, what do you think?

alexcrichton

alexcrichton commented on Nov 21, 2017

@alexcrichton
Member

Seems plausible to me!

I'm remembering now though that this probably won't work unfortunately. The standard library is created as a dynamic library which currently fixes the allocator to jemalloc (alloc_jemalloc that is). That will be required to get fixed first before we can have alloc_jemalloc and jemallocator

SimonSapin

SimonSapin commented on Nov 21, 2017

@SimonSapin
ContributorAuthor

Any mention of "use jemallocator" above implies selecting it with #[global_allocator]. My understanding is that alloc_jemalloc and its copy of jemalloc are only linked when #[global_allocator] is not used. Otherwise jemallocator could never be used at all.

alexcrichton

alexcrichton commented on Nov 21, 2017

@alexcrichton
Member

Er yes I think I understand what you're advocating for, and it sounds like a great plan. What I mean is that if you do it you'll get a compile time error and it will fail to compile.

SimonSapin

SimonSapin commented on Nov 21, 2017

@SimonSapin
ContributorAuthor

I don’t understand what error that would be.

alexcrichton

alexcrichton commented on Nov 21, 2017

@alexcrichton
Member

Er yes that's what I mean by "fixing the allocator" and "required to get fixed", it's a bit obscure... Turns out it's not a compile time error (but it should be right now) but you can observe the behavior with:

#![feature(global_allocator, allocator_api)]

use std::heap::*;

#[global_allocator]
static A: B = B;

struct B;

static mut HIT: bool = false;

unsafe impl<'a> Alloc for &'a B {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<*mut u8, AllocErr> {
        HIT = true;
        System.alloc(layout)
    }
    unsafe fn dealloc(&mut self, ptr: *mut u8, layout: Layout) {
        HIT = true;
        System.dealloc(ptr, layout)
    }
}

fn main() {
    println!("hello!");
    assert!(unsafe { HIT });
}

That program "accidentally" succeeds on Linux due to how the dynamic linker works but it fails on Windows where the dynamic linker works differently.

alexcrichton

alexcrichton commented on Nov 21, 2017

@alexcrichton
Member

Er, when compiling that program with -C prefer-dynamic, that is, like how the compiler is compiled.

alexcrichton

alexcrichton commented on Oct 20, 2018

@alexcrichton
Member

This is a pretty old issue now and I think is something we're probably not going to change to preserve performance of rustc, so I'm going to close this in favor of #36963 where we'll intentionally be routing malloc through jemalloc in the compiler specifically because we want LLVM to use jemalloc

RalfJung

RalfJung commented on Oct 20, 2018

@RalfJung
Member

@alexcrichton Just to be clear, it would still be wrong if outside the compiler, calling System::alloc would invoke jemmalloc, right? You just consider fixing that behavior part of #36963 now?

alexcrichton

alexcrichton commented on Oct 20, 2018

@alexcrichton
Member

@RalfJung it's not really wrong per se in the sense that this is how jemalloc/Linux are designed. The "system allocator" on Linux isn't glibc malloc but rather the malloc/free symbols. It just so happens that jemalloc's build by default overrides those symbols.

This is also intentional behavior that we desired on OSX/Linux because we want LLVM's usage of malloc and free to get routed to one global allocator, jemalloc, which is shared with rustc

SimonSapin

SimonSapin commented on Oct 21, 2018

@SimonSapin
ContributorAuthor

It’s not wrong for System::alloc to use jemalloc is configured to hook itself into the system allocator. However it’s definitely unexpected IMO that Rust does so by default. #55238 should fix this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    C-enhancementCategory: An issue proposing an enhancement or a PR with one.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @alexcrichton@SimonSapin@RalfJung@TimNN@sfackler

        Issue actions

          liballoc_system without #[global_allocator] uses jemalloc · Issue #45966 · rust-lang/rust