Tracking issue for sorting by expensive-to-compute keys (feature slice_sort_by_cached_key) #34447

Closed

Description

@dato (Contributor)

Hi—

Ideally, the implementation of sort_by_key() would invoke the received key function exactly once per item. This is highly beneficial when a costly computation (e.g., a distance function) needs to be used for sorting.

But Rust’s implementation as of 1.9 (link) calls the key function on every comparison, twice per pair of elements compared:

pub fn sort_by_key<B, F>(&mut self, mut f: F)
    where F: FnMut(&T) -> B, B: Ord
{
    self.sort_by(|a, b| f(a).cmp(&f(b)))
}

For comparison, Python highlighted this behavior in its release notes for 2.4. As per their current Sorting HOW TO:

The value of the key parameter should be a function that takes a single argument and returns a key to use for sorting purposes. This technique is fast because the key function is called exactly once for each input record.

Many thanks for considering. It’d be great if Rust could behave the same way.
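To make the difference concrete, here is a minimal sketch that counts key-function invocations for `sort_by_key` and for `sort_by_cached_key` (the method this tracking issue covers). It assumes a toolchain where `sort_by_cached_key` is available; `expensive_key` and the two `count_calls_*` helpers are hypothetical names introduced only for illustration.

```rust
/// Hypothetical stand-in for a costly key computation (e.g. a distance function).
fn expensive_key(x: &i32) -> i32 {
    (x - 50).abs()
}

/// Sorts with `sort_by_key` and returns how many times the key function ran.
fn count_calls_sort_by_key(v: &mut [i32]) -> usize {
    let mut calls = 0usize;
    v.sort_by_key(|x| {
        calls += 1; // runs for both sides of every comparison
        expensive_key(x)
    });
    calls
}

/// Sorts with `sort_by_cached_key` and returns how many times the key function ran.
fn count_calls_sort_by_cached_key(v: &mut [i32]) -> usize {
    let mut calls = 0usize;
    v.sort_by_cached_key(|x| {
        calls += 1; // keys are computed up front and stored
        expensive_key(x)
    });
    calls
}

fn main() {
    let mut a: Vec<i32> = (0..100).rev().collect();
    let mut b = a.clone();

    let plain = count_calls_sort_by_key(&mut a);
    let cached = count_calls_sort_by_cached_key(&mut b);

    println!("sort_by_key:        {} key calls for {} items", plain, a.len());
    println!("sort_by_cached_key: {} key calls for {} items", cached, b.len());
}
```

The exact count for `sort_by_key` depends on the sorting algorithm and the input, so no specific number is claimed here; the point is only that it grows with the number of comparisons rather than the number of items.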

Activity

Thiez (Contributor) commented on Jun 24, 2016

I suspect the method works this way to avoid allocation. Most people would expect this method not to allocate, and to sort in place. If you want to calculate keys only once, perhaps you could introduce some kind of caching inside f. This should be possible because sort_by_key takes a FnMut.
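A minimal sketch of the caching idea suggested above: because `sort_by_key` accepts an `FnMut`, the closure can own a memo table. This assumes the items are hashable and cheap enough to clone; `expensive_key` and `sort_with_memoized_key` are hypothetical names, not part of the standard library.

```rust
use std::collections::HashMap;

/// Hypothetical expensive key; here it is just the string length.
fn expensive_key(s: &str) -> usize {
    s.len()
}

/// Sorts `items` by `expensive_key`, memoizing results inside the closure.
/// Returns how many times `expensive_key` actually ran.
fn sort_with_memoized_key(items: &mut [String]) -> usize {
    let mut cache: HashMap<String, usize> = HashMap::new();
    let mut calls = 0usize;
    items.sort_by_key(|s| {
        // Look the item up first; compute the key only on a cache miss.
        *cache.entry(s.clone()).or_insert_with(|| {
            calls += 1;
            expensive_key(s)
        })
    });
    calls
}

fn main() {
    let mut items: Vec<String> = ["pear", "fig", "banana", "fig", "kiwi"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    let calls = sort_with_memoized_key(&mut items);
    println!("{:?} sorted with {} key computations", items, calls);
}
```

Note that every lookup still hashes and clones the item, so this only pays off when the key computation is much more expensive than the lookup itself.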

hanna-kruppe (Contributor) commented on Jun 24, 2016

@Thiez [T]::sort_by uses merge sort and thus already allocates.

Thiez (Contributor) commented on Jun 24, 2016

I stand corrected :-)

ExpHP (Contributor) commented on Jun 24, 2016

I imagine that the current method may still be more optimal for simple key functions like |obj| obj.some_member which lend themselves well to further optimization.

This is a marked difference from Python's case, where even the simplest key function still has overhead (making a Schwartzian transform-style sort the clear winner).
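For reference, a Schwartzian transform (decorate, sort, undecorate) can be sketched in Rust as follows. This is an illustrative helper under the assumption that taking the items by value is acceptable; `sort_by_key_once` is a hypothetical name, not a standard library function.

```rust
/// Decorate-sort-undecorate: `key` is called exactly once per element.
fn sort_by_key_once<T, K, F>(items: Vec<T>, mut key: F) -> Vec<T>
where
    K: Ord,
    F: FnMut(&T) -> K,
{
    // Decorate: pair every item with its precomputed key.
    let mut decorated: Vec<(K, T)> = items
        .into_iter()
        .map(|item| (key(&item), item))
        .collect();

    // Sort on the key only; `sort_by` is stable, so ties keep their order.
    decorated.sort_by(|a, b| a.0.cmp(&b.0));

    // Undecorate: drop the keys and return the items.
    decorated.into_iter().map(|(_, item)| item).collect()
}

fn main() {
    let words = vec!["ccc", "a", "bb", "dd"];
    let sorted = sort_by_key_once(words, |s| s.len());
    println!("{:?}", sorted);
}
```

The trade-off is the extra allocation for the decorated vector, which is why a trivial key such as a field access may be better served by the plain comparator form.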

DemiMarie (Contributor) commented on Jun 25, 2016

I think @ExpHP is correct: CPython's string sort, which is written in C, is much faster than any key function written in Python. Note that this is not necessarily true for PyPy (which has a JIT), and is definitely not true in Rust.

arthurprs (Contributor) commented on Jul 1, 2016

The lambda is just passed down as a comparator; you'd be surprised by how much the optimizer can do with that. The name is misleading, but this is not the key argument from Python: in Python's sort(ed) this would actually be the cmp argument.

The behavior you suggest has its uses, but it's probably something that belongs in an external crate.

ExpHP (Contributor) commented on Jul 1, 2016

The name is misleading, but this is not the key argument from Python: in Python's sort(ed) this would actually be the cmp argument.

To play devil's advocate a bit, this is not an entirely fair assessment; Rust already provides the capability of cmp via sort_by. So to me it does not seem unreasonable for some to expect that sort_by_key might be more than just a convenience method.

Actually, for that reason, I was surprised to learn that a sort_by_key method had been added in the first place! As somebody coming from Python, prior to 1.7.0 I was always frustrated by having to write things like |a,b| a.member.cmp(&b.member), and often dearly wished for a convenience method like the sort_by_key that exists today.

Then one day while converting some Python code I came across a sort with an expensive key method, and suddenly it all made sense. There are two different idioms for sorting a list by a key, each suited to different use cases. At the time, I concluded that this must have been the reason why Rust had no sort_by_key.
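The convenience aspect described above can be sketched as follows: for a cheap key such as a field access, the comparator form and `sort_by_key` produce the same ordering. The `Item` struct and both helper functions are hypothetical and exist only for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Item {
    member: u32,
    name: &'static str,
}

/// Comparator form: spells out the comparison between two elements.
fn sorted_with_sort_by(mut v: Vec<Item>) -> Vec<Item> {
    v.sort_by(|a, b| a.member.cmp(&b.member));
    v
}

/// Key form: names only the field to sort on.
fn sorted_with_sort_by_key(mut v: Vec<Item>) -> Vec<Item> {
    v.sort_by_key(|a| a.member);
    v
}

fn main() {
    let items = vec![
        Item { member: 3, name: "c" },
        Item { member: 1, name: "a" },
        Item { member: 2, name: "b" },
    ];
    let by_cmp = sorted_with_sort_by(items.clone());
    let by_key = sorted_with_sort_by_key(items);
    println!("{:?}", by_cmp);
    println!("{:?}", by_key);
}
```

Both forms are stable sorts, so they agree even when several items share the same `member` value.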

added T-libs-api (Relevant to the library API team, which will review and decide on the PR/issue.) and removed on Mar 24, 2017
added C-enhancement (Category: An issue proposing an enhancement or a PR with one.) and I-slow (Issue: Problems and improvements with respect to performance of generated code.) on Jul 25, 2017
added a commit that references this issue on Mar 27, 2018

Rollup merge of rust-lang#48639 - varkor:sort_by_key-cached, r=bluss

42de36d

