Speedup Ninja backend with many extract_objects or targets #13879

Open
wants to merge 9 commits into
base: master

Contributor

@bonzini bonzini commented Nov 6, 2024

This speeds up the target generation of QEMU, which is very slow.

  • Before: 41 seconds (total time 174 seconds)
  • After: 21 seconds (total time 134 seconds); an earlier revision of this PR measured 29 seconds (total time 150 seconds)

get_target_generated_sources often calls File.from_built_relative on
the same file when that file is used by many sources.  This is a
somewhat expensive call, both CPU- and memory-wise, so cache the
creation of build-directory files as well.

Signed-off-by: Paolo Bonzini <[email protected]>
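A minimal sketch of the caching idea described in this commit message. The `File` class below is a simplified stand-in with assumed fields, and the use of `functools.lru_cache` is an assumption for illustration, not necessarily how the patch does it:

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class File:
    # Simplified stand-in: the real class has more behavior than this.
    is_built: bool
    subdir: str
    fname: str

    @staticmethod
    @lru_cache(maxsize=None)
    def from_built_relative(relative: str) -> "File":
        # The path is split once per distinct string; later calls with the
        # same argument return the already-constructed object.
        dirpart, fnamepart = os.path.split(relative)
        return File(True, dirpart, fnamepart)


a = File.from_built_relative("gen/config.h")
b = File.from_built_relative("gen/config.h")
```

Because the object is frozen, sharing one instance between all callers is safe, and the second call costs a dictionary lookup instead of a path split plus an allocation.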
The deps and orderdeps are sorted on output, so there is no need to preserve
their order.

Signed-off-by: Paolo Bonzini <[email protected]>
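The reasoning can be sketched as follows: if the writer sorts dependencies anyway, a set gives the same output as a list while deduplicating for free. `BuildStatement` is a hypothetical, pared-down class, not Meson's actual build element:

```python
from typing import Set


class BuildStatement:
    """Hypothetical build statement that stores deps in sets."""

    def __init__(self, output: str, rule: str) -> None:
        self.output = output
        self.rule = rule
        self.deps: Set[str] = set()
        self.orderdeps: Set[str] = set()

    def add_dep(self, dep: str) -> None:
        self.deps.add(dep)  # duplicates collapse automatically

    def add_orderdep(self, dep: str) -> None:
        self.orderdeps.add(dep)

    def render(self) -> str:
        # Sorting happens here, so insertion order never mattered.
        line = f"build {self.output}: {self.rule}"
        if self.deps:
            line += " | " + " ".join(sorted(self.deps))
        if self.orderdeps:
            line += " || " + " ".join(sorted(self.orderdeps))
        return line


stmt = BuildStatement("foo.o", "cc")
for header in ("b.h", "a.h", "a.h"):
    stmt.add_dep(header)
stmt.add_orderdep("gen")
rendered = stmt.render()
```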
@bonzini bonzini force-pushed the speedups branch 3 times, most recently from 03c53a4 to c7bd527 Compare November 6, 2024 15:37
Accumulate into lists that are passed by the caller, thus avoiding
allocations and calls to extend() on recursive extract_objects().

Signed-off-by: Paolo Bonzini <[email protected]>
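A small sketch of the accumulator pattern this commit describes. The graph layout and the name `collect_objects` are hypothetical; only the idea of passing the output lists down the recursion comes from the commit message:

```python
from typing import Dict, List, Tuple

# Hypothetical graph: target name -> (own object files, targets it extracts from)
Graph = Dict[str, Tuple[List[str], List[str]]]


def collect_objects(graph: Graph, target: str,
                    obj_list: List[str], deps: List[str]) -> None:
    """Append results to caller-owned lists instead of returning new ones."""
    own_objects, extracted_from = graph[target]
    for obj in own_objects:
        obj_list.append(obj)
    for nested in extracted_from:
        deps.append(nested)
        # No temporary list and no extend() per recursion level.
        collect_objects(graph, nested, obj_list, deps)


graph: Graph = {
    "app": (["main.o"], ["libcore"]),
    "libcore": (["core.o"], ["libutil"]),
    "libutil": (["util.o"], []),
}
objects: List[str] = []
targets: List[str] = []
collect_objects(graph, "app", objects, targets)
```

The top-level caller allocates the two lists once; every recursive call writes into them directly.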
The proj_dir_to_build_root argument of determine_ext_objs() is always empty,
so remove it.

Signed-off-by: Paolo Bonzini <[email protected]>
Comment on lines 479 to 484
- proj_dir_to_build_root: str) -> T.Tuple[T.List[str], T.List[build.BuildTargetTypes]]:
- obj_list: T.List[str] = []
- deps: T.List[build.BuildTargetTypes] = []
+ proj_dir_to_build_root: str,
+ obj_list: T.List[str], deps: T.List[build.BuildTargetTypes]) -> None:
Member

@dcbaker I suspect you may have some opinions about this.

Contributor Author

--verbose? :)

mesonbuild/compilers/compilers.py (outdated review thread, resolved)
Cache the results of is_source(), which is called almost 900000
times in a QEMU setup, and avoid reinventing the
File.from_built_relative() wheel.  Together this basically
removes _determine_ext_objs() from the profile when building QEMU.

While at it, fix the existing wrong mypy annotation on
cached_object_by_name().

Signed-off-by: Paolo Bonzini <[email protected]>
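As a rough illustration of why caching helps here: a pure function of the file name that is called hundreds of thousands of times, mostly with repeated arguments, is a natural fit for memoization. The suffix set below is abbreviated and assumed, and `functools.lru_cache` is one possible mechanism, not necessarily the one used in the patch:

```python
import os
from functools import lru_cache

# Assumed, abbreviated suffix set for illustration only.
SOURCE_SUFFIXES = frozenset({"c", "cc", "cpp", "h", "hpp", "rs", "S"})


@lru_cache(maxsize=None)
def is_source(fname: str) -> bool:
    """Return True if the file name has a known source suffix."""
    suffix = os.path.splitext(fname)[1][1:]  # "foo.c" -> "c", "Makefile" -> ""
    return suffix in SOURCE_SUFFIXES


names = ["foo.c", "foo.o", "foo.c", "bar.h", "foo.c"]
results = [is_source(name) for name in names]
```

Only the three distinct names are actually analyzed; the repeated `foo.c` lookups are served from the cache.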
proj_dir_to_build_root is empty by default; in fact it is always empty,
except in some cases of the VS2010 backend.

Add it after the fact in flatten_object_list(), which reduces the
number of os.path.join() calls.

Signed-off-by: Paolo Bonzini <[email protected]>
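A sketch of "add the prefix after the fact", assuming a nested-list input. The signature shown here is hypothetical and much simpler than the real flatten_object_list():

```python
import os
from typing import List


def _flatten(items: list, out: List[str]) -> None:
    # Depth-first walk that appends leaves to a shared output list.
    for item in items:
        if isinstance(item, list):
            _flatten(item, out)
        else:
            out.append(item)


def flatten_object_list(objects: list, proj_dir_to_build_root: str = "") -> List[str]:
    flat: List[str] = []
    _flatten(objects, flat)
    if not proj_dir_to_build_root:
        # Common case: no prefix, so os.path.join() is never called.
        return flat
    return [os.path.join(proj_dir_to_build_root, obj) for obj in flat]


nested = ["a.o", ["b.o", ["c.o"]], "d.o"]
plain = flatten_object_list(nested)
prefixed = flatten_object_list(nested, "..")
```

The prefix is applied in one pass at the end, and only when it is non-empty, instead of being joined into every path at every level of the recursion.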
@bonzini bonzini force-pushed the speedups branch 3 times, most recently from 46cb2e6 to 47d8542 Compare November 6, 2024 17:55
version_compare can take a few milliseconds.  If you have a thousand object files
or a multiple thereof, it adds up.

Signed-off-by: Paolo Bonzini <[email protected]>
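To illustrate the cost argument: when the same version check runs once per object file, memoizing it (or hoisting it out of the loop) turns thousands of comparisons into one. The `version_compare` below is a toy re-implementation written for this sketch, not Meson's function, and the version strings are made up:

```python
import operator
from functools import lru_cache
from typing import Tuple

# Two-character operators come first so ">=" is not mistaken for ">".
_OPERATORS = ((">=", operator.ge), ("<=", operator.le), ("==", operator.eq),
              ("!=", operator.ne), (">", operator.gt), ("<", operator.lt))


def _key(version: str) -> Tuple[int, ...]:
    return tuple(int(part) for part in version.split("."))


@lru_cache(maxsize=None)
def version_compare(version: str, condition: str) -> bool:
    """Toy comparison of dotted numeric versions, e.g. ('1.12.1', '>=1.10.0')."""
    for prefix, op in _OPERATORS:
        if condition.startswith(prefix):
            return op(_key(version), _key(condition[len(prefix):]))
    return _key(version) == _key(condition)


# One check per object file, but only the first call does any parsing.
flags = ["new" if version_compare("1.12.1", ">=1.10.0") else "old"
         for _ in range(1000)]
```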
@bonzini bonzini changed the title Speedup Ninja backend with many extract_objects Speedup Ninja backend with many extract_objects or targets Nov 7, 2024
Avoid expensive calls and loops, relying on Python builtins as much
as possible.  Track whether any options need to be deduped at
flush_pre_post() time, and if not, just concatenate pre, _container
and post.

Before:

   ncalls  tottime  cumtime
    19268    0.163    3.586 arglist.py:97(__init__)
    45127    0.251    4.530 arglist.py:142(__iter__)
    81866    3.623    5.013 arglist.py:108(flush_pre_post)
    76618    3.793    5.338 arglist.py:273(__iadd__)

After:

   ncalls  tottime  cumtime
    35647    0.156    0.627 arglist.py:160(__iter__)
    18674    0.211    3.442 arglist.py:97(__init__)
    78998    2.627    3.603 arglist.py:116(flush_pre_post)
    73774    3.605    5.049 arglist.py:292(__iadd__)

Signed-off-by: Paolo Bonzini <[email protected]>
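A toy sketch of the "track whether dedup is needed" idea. The attribute names pre, _container and post come from the commit message; everything else (the `_seen` set, the keep-first dedup rule, the method names) is an assumption and far simpler than Meson's real argument list:

```python
from typing import Iterator, List, Set


class ArgList:
    """Toy argument list with a fast path in flush_pre_post()."""

    def __init__(self) -> None:
        self._container: List[str] = []
        self.pre: List[str] = []
        self.post: List[str] = []
        self._seen: Set[str] = set()
        self._needs_dedup = False

    def _note(self, arg: str) -> None:
        # Record at insertion time whether a duplicate was ever added.
        if arg in self._seen:
            self._needs_dedup = True
        else:
            self._seen.add(arg)

    def prepend(self, arg: str) -> None:
        self._note(arg)
        self.pre.append(arg)

    def append(self, arg: str) -> None:
        self._note(arg)
        self.post.append(arg)

    def flush_pre_post(self) -> None:
        merged = self.pre + self._container + self.post  # fast path: plain concat
        if self._needs_dedup:
            merged = list(dict.fromkeys(merged))  # slow path: keep first occurrence
            self._needs_dedup = False
        self._container = merged
        self.pre = []
        self.post = []

    def __iter__(self) -> Iterator[str]:
        self.flush_pre_post()
        return iter(self._container)


args = ArgList()
args.append("-O2")
args.prepend("-Iinc")
first = list(args)
args.append("-O2")  # duplicate: triggers the slow path on the next flush
second = list(args)
```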
Regexes can be surprisingly slow.  This small change brings
ninja_quote() from 12 to 3 seconds when building QEMU.

Before:

   ncalls  tottime  percall  cumtime  percall
  3734443    4.872    0.000   11.944    0.000

After:

   ncalls  tottime  percall  cumtime  percall
  3595590    3.193    0.000    3.196    0.000

Signed-off-by: Paolo Bonzini <[email protected]>
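One way such a speedup can be obtained is a fast path that skips the regex when no character needs escaping. This is a sketch under assumptions: the set of quoted characters (`$`, space, `:`) and both function names are chosen for illustration, and the actual change in the patch may differ:

```python
import re

_QUOTE_PAT = re.compile(r"[$ :]")


def ninja_quote_regex(text: str) -> str:
    """Baseline: always run the regex substitution."""
    return _QUOTE_PAT.sub(r"$\g<0>", text)


def ninja_quote_fast(text: str) -> str:
    """Return unchanged strings immediately; escape with str.replace otherwise."""
    if "$" not in text and " " not in text and ":" not in text:
        return text  # most paths need no quoting at all
    # "$" must be escaped first so the "$" added for " " and ":" is not doubled.
    return text.replace("$", "$$").replace(" ", "$ ").replace(":", "$:")


samples = ["foo.c", "my file.c", "C:/src/a.c", "a b$c:d", ""]
same = all(ninja_quote_regex(s) == ninja_quote_fast(s) for s in samples)
```

Substring checks with `in` and `str.replace` run in C without building match objects, which is why they tend to beat a regex substitution on short strings.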