Skip to content

Use backpack #8

Open
Open
@dolio

Description

@dolio

This is another more open ended 'idea' issue that I wouldn't expect to get resolved any time soon.

The fundamental idea behind the performance of vector-algorithms is that GHC needs to specialize code for each sort. However, a fundamental barrier to this is that the comparison (or, similar information in the case of radix sorts) is unknown in the core algorithms. This causes multiple possible problems:

  1. Indirect calls to the comparison can be far more expensive than direct primop uses for simple types
  2. The higher-order indirection doesn't allow for worker-wrapper unboxing, adding allocation to the inner loop
  3. The boxing required in 2 could cascade and cause further lack of optimization in the loops

Currently, vector-algorithms uses a sort of sledgehammer approach to combat this: it inlines every function to every call site. This inlining means that the comparison can become statically known, inlined into the sorting function, and the whole thing can be optimized as a unit. However, it also causes massive code blow-up.

There is one case that demonstrates a better possibility. In class-constrained sorts, we do not have to inline, because GHC will specialize code to a particular class instance. So, the sort :: Ord a => ... functions can be compiled independently and referred to with optimal performance. However, even these have problems, because, for example, introsort makes use of heap sort, and the reference happens at too low a level for classes to save us. So, heap sort gets inlined into introsort. And worse, they both use certain optimal small sorts, so those probably get inlined twice.

So, what we need is a means to specialize code to a particular comparison function while maintaining separate compilation. I believe backpack has this functionality. The sorting module would be parameterized by the comparison, so everything therein could refer to it without passing it around. And e.g. the parameterized introsort module could reference the heap sort module with the same parameter, but they could be compiled separately.

This even has the potential to make some intra-module code easier to inspect at the core level. Right now everything gets compiled into one giant function there for similar performance reasons. But in many cases it is probably unnecessary, except for the fact that the comparison has to be passed around as a function argument rather than a module parameter.

Unfortunately, I only know the broad strokes about backpack, so I can't assist much. But if someone with the motivation and know-how volunteers, this would be a fundamental improvement to the structure of the library that I've been trying to find a way to do for years.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions