-
Notifications
You must be signed in to change notification settings - Fork 27
run PHD tests in parallel by default #882
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
6615594 to
f95fa59
Compare
f95fa59 to
e0d4da0
Compare
|
after being mostly perplexed on Friday afternoon: there are a few wrinkles that show up with this form of concurrency in CI that i wasn't seeing locally. the more confusing one is now, it's not great to download artifacts N times, or extract them potentially N times, but it's weird that the extra rounds of extracting |
if phd-runner decides to run with parallelism > 1, the tmpdir and artifact directories, which phd-runner will write to from each framework, are rewritten to be unique per-framework. this means the directory layout changes based on available parallelism, and makes the `phd-run-with-args.sh` script somewhat more gross, and generally is pretty unfortunate. at this point it's just to demonstrate that something is possible, but this really deserves the requisite surgery to share one log dir and one artifact dir across the Frameworks.
f09ce77 to
589a5ea
Compare
|
589a5ea leaves this in an "it.... works, i guess" place - instead of doing the requisite surgery for especially since i'm doing this only if the inferred parallelism is >1 it's pretty gross. since at this point i only intend to show that we could do something if we were gonna sweat the time, i'm moving this back to draft. in the future i'd love to do the right kind of surgery to just share state between runners here! |
|
the run for |
this adds a
--parallelismflag tophd-runner(and by extension,cargo xtask phd run), which when set tellsphd-runnerwhat the upper bound is for how many tests to run concurrently.if
--parallelismis not passed,phd-runnerwill try to guess a level of parallelism that won't cause tests to fail. this involves a few pretty arbitrary choices that i think are reasonable generally, definitely fine for the environments i know most about (my workstation and lab CI), but if they're wrong will be kind of annoying.MAX_VMS_GUESSassumes that no more than three VMs are running in a test at the same time. i think the actual number here is two. the heuristic also assumes it's safe to use up to a quarter of physical memory if the reservoir is not configured. if these are wrong the worst case is we'll try running too many test VMs concurrently and either have slower tests (reclaiming ARC memory, paging stuff out, ...) or outright fail to allocate for a VM and fail the test for that reason.and of course, if PHD is run with
--omicron-buildand there's no reservoir, that'll fail regardless of parallelism. so really, please set the reservoir to some non-zero size! but havingphd-runnerdo that implies having topfexec cargo xtask phd runand that seems in poor form. so i implore the i'm-sure-studious-reader-of-test-logs to set a reservoir size for tests.on my workstation the PHD tests run with the Alpine image in about 320 seconds. out of the box this infers a max parallelism of 5, wherein the tests take 71 seconds. or with a manual
--parallelism 8, 42 seconds. seems good! i've run with--parallelism 8in a loop for a bit and haven't seen anything untoward.