liballoc_system without #[global_allocator] uses jemalloc #45966

Closed
@SimonSapin

Description

@SimonSapin
Contributor

This was found by @RalfJung in #45955.

alloc_system::System::alloc calls libc::malloc which is defined as:

extern {
    pub fn malloc(size: size_t) -> *mut c_void;
}

When #[global_allocator] is not used, the current default for executables is alloc_jemalloc, which links jemalloc as configured in src/liballoc_jemalloc/build.rs. On most platforms, this is without --with-jemalloc-prefix, which causes jemalloc to define an unprefixed malloc symbol that "overrides" libc’s and ends up being used by alloc_system.

So alloc_system doesn’t do what the name suggests, in this situation.
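
To illustrate why the declaration above cannot pin down a particular allocator, here is a minimal sketch (the helper name `roundtrip_through_malloc` is hypothetical, not part of alloc_system). The `extern` block only names the symbol `malloc`; which definition it binds to is decided at link time, so if an unprefixed jemalloc is linked into the executable, this same code would presumably end up calling jemalloc instead of libc.

```rust
use std::os::raw::c_void;

// Same shape as the libc declaration quoted above: only the symbol names are
// fixed here. The linker decides which `malloc`/`free` definitions are used.
extern "C" {
    fn malloc(size: usize) -> *mut c_void;
    fn free(ptr: *mut c_void);
}

/// Hypothetical helper: allocates `len` bytes through whatever `malloc` was
/// linked in, fills them with `byte`, copies them out, and frees the buffer.
/// Returns `None` if `malloc` returned a null pointer.
pub fn roundtrip_through_malloc(len: usize, byte: u8) -> Option<Vec<u8>> {
    unsafe {
        let ptr = malloc(len) as *mut u8;
        if ptr.is_null() {
            return None;
        }
        // Initialize the buffer before reading it back.
        std::ptr::write_bytes(ptr, byte, len);
        let copy = std::slice::from_raw_parts(ptr, len).to_vec();
        // Release through the matching `free`, which is resolved the same way.
        free(ptr as *mut c_void);
        Some(copy)
    }
}
```

Nothing in the Rust source distinguishes "the system malloc" from "whatever is called malloc in this process", which is the gap this issue describes.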

Like https://github.com/alexcrichton/jemallocator/issues/19 this problem will disappear when alloc_jemalloc is eventually removed, but in the meantime alloc_system doesn’t always do what the name says it does.

We stopped prefixing jemalloc symbols in #31460 in order to make LLVM use them. Could we perhaps only do this when compiling a compiler? Or perhaps the compiler could switch to using the jemallocator crate, which would gain a Cargo feature flag to disable prefixing? It would do something like that anyway to keep that LLVM+jemalloc benefit when alloc_jemalloc is removed and the default for executables is changed to alloc_system.

CC @alexcrichton

Activity

added the C-enhancement label (Category: An issue proposing an enhancement or a PR with one.) on Nov 14, 2017
SimonSapin

SimonSapin commented on Nov 20, 2017

@SimonSapin
Contributor, Author

@alexcrichton wrote in #46117 (comment):

Unfortunately I think it's basically just impossible to fix System.alloc calling jemalloc because of linker trickery. Our only recourse is to remove jemalloc.

Do you mean impossible while also making LLVM use jemalloc? If we revert (relevant parts of) #31460 and go back to always compiling jemalloc with --with-jemalloc-prefix=something, wouldn’t that fix this issue? What’s the plan for LLVM once the alloc_jemalloc crate is removed?

sfackler

sfackler commented on Nov 20, 2017

@sfackler
Member

Prefixing jemalloc symbols would resolve this issue, but I would argue that would be a net negative. It's not great to have two competing allocators both running at the same time in a single process.

As long as alloc_jemalloc/jemallocator/whatever is configured to have non-prefixed symbols, LLVM will continue to use it AFAIK.

alexcrichton

alexcrichton commented on Nov 21, 2017

@alexcrichton
Member

@SimonSapin ah yeah sorry what I meant was we could indeed prefix the symbols but the intention is to get LLVM to use jemalloc (as it makes it ~10% faster historically). I think our long term plan is to remove jemalloc from libstd but leave it in the compiler, so rustc itself will still use jemalloc but Rust programs by default will not.

sfackler

sfackler commented on Nov 21, 2017

@sfackler
Member

If you really want to talk to specifically glibc malloc, you can link to __libc_malloc, but that's probably not a portable thing.

SimonSapin

SimonSapin commented on Nov 21, 2017

@SimonSapin
Contributor, Author

@alexcrichton Right, I think that split between rustc and std is probably the best eventual outcome.

@sfackler Interesting. I think we still want alloc_system to call plain malloc, though. For example Firefox redefines malloc (in a fork of an old version of jemalloc) and expects Rust dynamic libraries to use it.

RalfJung

RalfJung commented on Nov 21, 2017

@RalfJung
Member

It's not great to have two competing allocators both running at the same time in a single process.

I can see that point. But then we should not be providing APIs that pretend to do this, while they actually do not.

SimonSapin

SimonSapin commented on Nov 21, 2017

@SimonSapin
Contributor, Author

TL;DR: let’s switch rustc to unprefixed jemallocator and restore symbol prefixes in alloc_jemalloc now?


So, we’re discussing a number of desirable but apparently competing points. I think we can have our cake and eat it too. I’m gonna name them to untangle everything without repeating lengthy phrases over and over.

  • A. Stable users can choose to have std::heap::Heap in Rust executables use jemalloc
  • B. Stable users can choose to have std::heap::Heap in Rust executables use the system allocator
  • D. Rustc and LLVM-in-rustc use jemalloc (for that 10% perf improvement)
  • E. C/C++ libraries that use malloc and are linked with a Rust program end up using the same allocator as std::heap::Heap
  • F. std::heap::System uses the system allocator even if not selected for std::heap::Heap with #[global_allocator]: fixing this bug
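
As a rough illustration of goals B and F, here is a minimal sketch of selecting the system allocator with #[global_allocator]. It uses `std::alloc::System` and the `GlobalAlloc` trait, which are the names the API was later stabilized under (the thread uses the 2017 `std::heap` names). The `CountingSystem` wrapper, the `ALLOC_CALLS` counter, and `alloc_calls` are hypothetical names added only to make the delegation observable.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards every request to `System` and counts
/// how many times `alloc` was called.
pub struct CountingSystem;

pub static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingSystem {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        // Delegate to the "system" allocator. Per this issue, that is whatever
        // `malloc` the linker resolved, not necessarily libc's.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Selecting the allocator for the whole program (goal B).
#[global_allocator]
static GLOBAL: CountingSystem = CountingSystem;

/// Number of `alloc` calls observed so far.
pub fn alloc_calls() -> usize {
    ALLOC_CALLS.load(Ordering::Relaxed)
}
```

Goal F is about the narrower case where `System` is called directly without being selected this way; the wrapper above does not by itself guarantee which underlying `malloc` is reached.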

Currently we have D and E on some platforms since alloc_jemalloc configures jemalloc without a symbol prefix, and A since it’s the default.

There’s a number of changes we can make. These are not blocked as far as I can tell:

  • 1. Make alloc_jemalloc always configure jemalloc with a symbol prefix.
  • 2. Remove alloc_jemalloc and make B the default
  • 3. Add an unprefixed Cargo feature to https://crates.io/crates/jemalloc-sys (and perhaps forward it through https://crates.io/crates/jemallocator) so that it configures jemalloc without a symbol prefix. (The default would still be --with-jemalloc-prefix=_rjem_.)
  • 4. Make rustc use jemallocator with 3 instead of alloc_jemalloc
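
Change 3 could look roughly like the following build-script logic. This is only a sketch under stated assumptions: the feature name `unprefixed` comes from the proposal above, while the helper functions are hypothetical and are not taken from the actual jemalloc-sys build script. It relies on Cargo exposing an enabled feature to build scripts as a `CARGO_FEATURE_<NAME>` environment variable.

```rust
/// Hypothetical helper: returns the extra arguments to pass to jemalloc's
/// `configure` script depending on whether unprefixed symbols are requested.
pub fn jemalloc_prefix_args(unprefixed: bool) -> Vec<String> {
    if unprefixed {
        // No prefix flag: on most platforms (per this issue) jemalloc then
        // defines plain `malloc`, `free`, etc., overriding libc's symbols.
        Vec::new()
    } else {
        // Default proposed above: keep jemalloc's symbols out of libc's way.
        vec!["--with-jemalloc-prefix=_rjem_".to_string()]
    }
}

/// Hypothetical helper: in a build script, Cargo sets
/// `CARGO_FEATURE_UNPREFIXED` when the `unprefixed` feature is enabled.
pub fn unprefixed_feature_enabled() -> bool {
    std::env::var_os("CARGO_FEATURE_UNPREFIXED").is_some()
}
```

A build script would call `jemalloc_prefix_args(unprefixed_feature_enabled())` and append the result to its `configure` invocation. Under change 4, rustc would enable the feature while ordinary users keep the prefixed default.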

This is blocked on API design decisions:

With 1 alone we gain F, but lose D and E.

With 2 alone we gain B and F, but lose A, D, and E. Which of A or B is more desirable (if we have to choose) is debatable. Adding 5 restores A.

#33082 (comment) suggests that doing 2 is planned, but not until 5 is solved (presumably to avoid losing A). At that point we’ll likely also want to avoid losing D (or E). Doing 3 and 4 seems to me to be the easiest way to achieve that.

However we don’t need to wait for 5 to be stabilized before doing 3 or 4. So I suggest doing 3, 4, then 1 now-ish. Both in order to fix F, and to be ready to do 2 later when 5 is unblocked. The only point we would temporarily lose is E for stable users.

@alexcrichton, @sfackler, what do you think?

alexcrichton

alexcrichton commented on Nov 21, 2017

@alexcrichton
Member

Seems plausible to me!

I'm remembering now though that this probably won't work unfortunately. The standard library is created as a dynamic library which currently fixes the allocator to jemalloc (alloc_jemalloc that is). That would need to be fixed first before we can have both alloc_jemalloc and jemallocator.

SimonSapin

SimonSapin commented on Nov 21, 2017

@SimonSapin
Contributor, Author

Any mention of "use jemallocator" above implies selecting it with #[global_allocator]. My understanding is that alloc_jemalloc and its copy of jemalloc are only linked when #[global_allocator] is not used. Otherwise jemallocator could never be used at all.

alexcrichton

alexcrichton commented on Nov 21, 2017

@alexcrichton
Member

Er yes I think I understand what you're advocating for, and it sounds like a great plan. What I mean is that if you do it you'll get a compile time error and it will fail to compile.

SimonSapin

SimonSapin commented on Nov 21, 2017

@SimonSapin
Contributor, Author

I don’t understand what error that would be.
