Compile-Time Performance Regression #37864

Description

@Mark-Simulacrum
Member

#37660 appears to have regressed performance by ~6% on bootstrap, due to a near tripling in time for item-bodies checking (23s to 62s). I'm not sure if that was expected or not, but someone should probably investigate. Let me know if I should open a new issue about that.
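As a quick sanity check on the "near tripling" figure, the two timings quoted above can be turned into a slowdown factor. This is only a sketch of the arithmetic; the function name `slowdown_factor` is made up for illustration, and the only inputs are the 23s and 62s numbers from the report.

```rust
/// Ratio of the new timing to the old one (hypothetical helper).
fn slowdown_factor(before_secs: f64, after_secs: f64) -> f64 {
    after_secs / before_secs
}

fn main() {
    // Figures quoted in the report for item-bodies checking.
    let (before, after) = (23.0, 62.0);
    let factor = slowdown_factor(before, after);
    let added = after - before;
    println!("item-bodies checking: {:.2}x slower, +{:.0}s", factor, added);
}
```

The factor comes out at roughly 2.7x, i.e. 39 extra seconds in that pass.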

See here for a comparison across all crates.

cc @nikomatsakis

Activity

nikomatsakis (Contributor) commented on Nov 18, 2016

Hmm, yeah, not really expected.

nikomatsakis (Contributor) commented on Nov 18, 2016

This is the diff that affected the rustc_typeck crate:

https://gist.github.com/nikomatsakis/ca47ebbcd264452539074899b6d09355

Not seeing yet what might have caused such a perturbation.

self-assigned this on Nov 18, 2016

added labels on Nov 18, 2016:
I-compiletime (Issue: Problems and improvements with respect to compile times.)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)

michaelwoerister (Member) commented on Nov 18, 2016

@eddyb: You also had a look at that PR, any ideas?

Mark-Simulacrum (Member, Author) commented on Nov 18, 2016

Is it possible the PR increased the amount of code in librustc and hit a pathological case in some way? This shows that most crates had little-to-no difference except for librustc itself, which jumped up by ~30 seconds.

michaelwoerister (Member) commented on Nov 18, 2016

The syntex-syntax test case also shows the regression, so there's definitely something to it.

Mark-Simulacrum (Member, Author) commented on Nov 18, 2016

It looks like my initial assessment was wrong: rustc is not the only crate to show the increase. But this graph shows that it has by far the largest increase among the rustc crates in that pass.

eddyb (Member) commented on Nov 19, 2016

I'd suggest comparing with callgrind: if the number of calls related to inference, for example, changes, well... My suspicion is basically "impl children get checked twice", but then tests couldn't pass because errors would also be doubled? I'm not sure.
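The comparison suggested here boils down to diffing per-function counts from two profiling runs and looking at the biggest movers. A minimal sketch of that idea follows, assuming the counts have already been extracted into maps; the `Profile` type, the `profile` and `diff_profiles` helpers, and the function names and numbers in `main` are all hypothetical and are not callgrind output.

```rust
use std::collections::HashMap;

/// Per-function counts from one profiling run (function name -> count).
type Profile = HashMap<String, u64>;

/// Builds a Profile from (name, count) pairs.
fn profile(entries: &[(&str, u64)]) -> Profile {
    entries.iter().map(|&(n, c)| (n.to_string(), c)).collect()
}

/// Returns (function, before, after) for every function whose count changed,
/// sorted by the largest absolute change first.
fn diff_profiles(before: &Profile, after: &Profile) -> Vec<(String, u64, u64)> {
    // Union of function names from both runs, sorted and de-duplicated.
    let mut names: Vec<&String> = before.keys().chain(after.keys()).collect();
    names.sort();
    names.dedup();

    let mut changes: Vec<(String, u64, u64)> = names
        .into_iter()
        .map(|name| {
            // A function missing from one run counts as zero there.
            let b = before.get(name).copied().unwrap_or(0);
            let a = after.get(name).copied().unwrap_or(0);
            (name.clone(), b, a)
        })
        .filter(|(_, b, a)| a != b)
        .collect();

    changes.sort_by_key(|(_, b, a)| std::cmp::Reverse(a.abs_diff(*b)));
    changes
}

fn main() {
    // Made-up counts, only to exercise the function; not measurements.
    let before = profile(&[("fn_a", 10), ("fn_b", 5)]);
    let after = profile(&[("fn_a", 10), ("fn_b", 9), ("fn_c", 7)]);
    for (name, b, a) in diff_profiles(&before, &after) {
        println!("{name}: {b} -> {a}");
    }
}
```

If something really were being checked twice, the affected functions would be expected to show up near the top of such a diff with roughly doubled counts.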

nnethercote (Contributor) commented on Nov 21, 2016

nearest_common_ancestor is at least part of the problem, according to Cachegrind. For example, from syntex:

--------------------------------------------------------------------------------
             Ir
--------------------------------------------------------------------------------
141,932,296,063  PROGRAM TOTALS

--------------------------------------------------------------------------------
           Ir  file:function
--------------------------------------------------------------------------------
6,040,510,628  /home/njn/moz/rust0/src/librustc/middle/region.rs:rustc::middle::region::RegionMaps::nearest_common_ancestor
5,280,761,738  /build/glibc-Qz8a69/glibc-2.23/malloc/malloc.c:_int_malloc
4,376,166,866  /home/njn/moz/rust0/src/librustc/middle/region.rs:rustc::middle::region::RegionMaps::nearest_common_ancestor::ancestors_of
3,449,453,667  /home/njn/moz/rust0/src/libcollections/vec.rs:rustc::middle::region::RegionMaps::nearest_common_ancestor::ancestors_of
3,236,809,213  /build/glibc-Qz8a69/glibc-2.23/malloc/malloc.c:_int_free
1,979,912,076  ???:???
1,959,605,584  /build/glibc-Qz8a69/glibc-2.23/string/../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:__memcpy_avx_unaligned
1,899,088,086  /home/njn/moz/rust0/src/rt/miniz.c:tdefl_compress
1,767,561,366  /build/glibc-Qz8a69/glibc-2.23/malloc/malloc.c:malloc
1,540,380,901  /home/njn/moz/rust0/src/libstd/collections/hash/table.rs:<std::collections::hash::set::HashSet<T, S>>::insert
1,377,680,312  /home/njn/moz/rust0/src/liballoc/raw_vec.rs:rustc::middle::region::RegionMaps::nearest_common_ancestor::ancestors_of
1,144,766,653  /home/njn/moz/rust0/src/libstd/collections/hash/table.rs:<std::collections::hash::set::HashSet<T, S>>::get

The 142 billion instructions executed is up from 124 billion. I've not seen nearest_common_ancestor in any profiles prior to today.
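For scale, the four rows above attributed to nearest_common_ancestor and its ancestors_of helper add up to about 15.2 billion instructions, roughly 10.7% of the program total. The vec.rs and raw_vec.rs rows under ancestors_of, together with the malloc entries, would be consistent with a routine that builds a fresh vector of ancestors on each call. The sketch below illustrates that general shape only. It is not the code in librustc/middle/region.rs; `ScopeId`, `ScopeTree`, and the parent map are hypothetical stand-ins.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a scope identifier.
type ScopeId = u32;

/// Hypothetical scope tree: maps each scope to its parent; the root has no entry.
struct ScopeTree {
    parent: HashMap<ScopeId, ScopeId>,
}

impl ScopeTree {
    /// Collects `scope` and all of its ancestors, innermost first.
    /// Every call allocates and grows a fresh Vec.
    fn ancestors_of(&self, scope: ScopeId) -> Vec<ScopeId> {
        let mut chain = vec![scope];
        let mut current = scope;
        while let Some(&p) = self.parent.get(&current) {
            chain.push(p);
            current = p;
        }
        chain
    }

    /// Nearest common ancestor found by comparing the two ancestor chains
    /// from the root end and keeping the last scope on which they agree.
    fn nearest_common_ancestor(&self, a: ScopeId, b: ScopeId) -> Option<ScopeId> {
        let a_chain = self.ancestors_of(a);
        let b_chain = self.ancestors_of(b);
        a_chain
            .iter()
            .rev()
            .zip(b_chain.iter().rev())
            .take_while(|(x, y)| x == y)
            .last()
            .map(|(x, _)| *x)
    }
}

fn main() {
    // Tree: 0 is the root; 1 and 4 are children of 0; 2 and 3 are children of 1.
    let parent: HashMap<ScopeId, ScopeId> =
        [(1, 0), (2, 1), (3, 1), (4, 0)].iter().copied().collect();
    let tree = ScopeTree { parent };
    println!("{:?}", tree.nearest_common_ancestor(2, 3));
    println!("{:?}", tree.nearest_common_ancestor(2, 4));
}
```

In a design like this, each query costs two allocations plus a walk to the root, so a jump in the number of queries (or in tree depth) would show up as both ancestors_of and allocator time, which is at least compatible with the profile above.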

If the problem can't be found soon I suggest reverting #37660.

nikomatsakis (Contributor) commented on Nov 21, 2016

So I did some experimentation with a small test file: it's certainly not as simple as something in typeck happening twice, or at least, if it is, I haven't figured out what yet.

@nnethercote's samples suggest something about region inference, but we are not running regionck more often, as far as I can tell (nor typeck). Or at least we don't do so on a very simple test case. I will try experimenting with some bigger ones.

> If the problem can't be found soon I suggest reverting #37660.

I'm not ready to revert yet. Please consult with me before considering such a thing.

nikomatsakis (Contributor) commented on Nov 21, 2016

OK, I may have found the culprit.


        Compile-Time Performance Regression · Issue #37864 · rust-lang/rust