Initial implementation of parallel type checking #20280
base: master
Conversation
JukkaL left a comment:
Thanks for working on this -- parallel processing has huge potential, since every CPU has multiple cores, and core counts only seem to keep increasing year after year. Not a full review, but I left some minor comments.
mypy/build.py
Outdated
    for worker in manager.workers:
        data = receive(worker.conn)
        assert data["status"] == "ok"
        send(worker.conn, {"sccs": [(list(scc.mod_ids), scc.id, list(scc.deps)) for scc in sccs]})
Precompute the data outside the loop, since it's the same for each worker.
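The suggested hoisting could look something like this (a minimal sketch; the SCC dataclass and send stub are hypothetical stand-ins for mypy's real types, but the message shape follows the diff above):

```python
from dataclasses import dataclass, field


@dataclass
class SCC:
    """Hypothetical stand-in for mypy's SCC bookkeeping."""
    mod_ids: set[str]
    id: int
    deps: set[int] = field(default_factory=set)


def build_scc_message(sccs: list[SCC]) -> dict:
    # Built once, outside the per-worker loop: the payload is
    # identical for every worker, so there is no need to rebuild it.
    return {"sccs": [(list(scc.mod_ids), scc.id, list(scc.deps)) for scc in sccs]}


sent = []


def send(conn, data):  # stand-in for the real IPC send
    sent.append((conn, data))


workers = ["worker-0", "worker-1"]
message = build_scc_message([SCC({"a"}, 0), SCC({"b"}, 1, {0})])
for conn in workers:
    send(conn, message)
```

The same object is handed to every worker, so serialization cost aside, the per-worker work in the loop shrinks to just the send.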
mypy/main.py
Outdated
    internals_group.add_argument("--export-ref-info", action="store_true", help=argparse.SUPPRESS)

    # Experimental parallel type-checking support.
    internals_group.add_argument("--num-workers", type=int, default=0, help=argparse.SUPPRESS)
What about also allowing -n for this, similar to pytest?
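A hedged sketch of what the alias might look like, mirroring pytest-xdist's -n (the parser setup here is illustrative, not mypy's actual CLI code):

```python
import argparse

parser = argparse.ArgumentParser(prog="mypy-sketch")
# Hypothetical: expose both -n and --num-workers; argparse derives the
# destination (num_workers) from the long option name.
parser.add_argument(
    "-n", "--num-workers", type=int, default=0, help=argparse.SUPPRESS
)

opts = parser.parse_args(["-n", "4"])  # opts.num_workers == 4
```

Both spellings feed the same destination, so the rest of the code keeps reading options.num_workers unchanged.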
mypy/defaults.py
Outdated
    RECURSION_LIMIT: Final = 2**14

    WORKER_START_INTERVAL: Final = 0.03
30ms can be a large fraction of Python process startup time. It might be a bit more efficient to have this as 10ms, for example, to speed up small builds a little.
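Back-of-the-envelope math for the suggestion (the helper is purely illustrative):

```python
# With a per-worker start interval, the last of N workers starts
# (N - 1) * interval seconds after the first. The constant name mirrors
# the diff; the arithmetic is just to illustrate the review comment.
WORKER_START_INTERVAL = 0.03  # current value from the diff


def startup_delay(num_workers: int, interval: float) -> float:
    return (num_workers - 1) * interval


# With 8 workers, 30ms staggering delays the last worker by ~210ms,
# while 10ms would cut that to ~70ms -- noticeable on small builds.
current = startup_delay(8, WORKER_START_INTERVAL)
proposed = startup_delay(8, 0.01)
```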
mypy/build_worker/worker.py
Outdated
    * Load graph using the sources, and send "ok" to coordinator.
    * Receive SCC structure from coordinator, and ack it with an "ok".
    * Receive an SCC id from coordinator, process it, and send back the results.
    * When prompted by coordinator (with s "final" message), cleanup and shutdown.
What's s "final"?
This is just a typo, should be a "final" :-)
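The worker lifecycle described in the docstring above can be sketched as a simple message loop (receive, send, load_graph, and process_scc are injected stand-ins, not mypy's actual IPC API):

```python
def run_worker(conn, receive, send, load_graph, process_scc) -> None:
    # 1. Load the graph and tell the coordinator we are ready.
    load_graph()
    send(conn, {"status": "ok"})
    # 2. Receive the SCC structure and ack it with an "ok".
    sccs = receive(conn)["sccs"]
    send(conn, {"status": "ok"})
    # 3. Process SCC ids until the coordinator sends a "final" message.
    while True:
        msg = receive(conn)
        if msg.get("final"):
            break  # 4. Cleanup and shutdown.
        send(conn, {"id": msg["id"], "result": process_scc(msg["id"], sccs)})
```

The key property is that the worker is entirely reactive: it never initiates work, only answers coordinator messages, which keeps the coordinator in control of scheduling.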
    python: '3.14'
    os: ubuntu-24.04-arm
    toxenv: py
    tox_extra_args: "-n 4 --mypy-num-workers=4 mypy/test/testcheck.py"
Would it make sense to add some test cases that are specifically designed to test parallel type checking, e.g. a long import chain, or potential for a large amount of parallelism (no need to do this in this PR)?
Yes, I was thinking about this. I will add this to the list of follow-up items in the PR description so that I don't forget about it.
    workers = []
    if options.num_workers > 0:
        pickled_options = pickle.dumps(options.snapshot())
Later on, we may want to use something more efficient than pickle (but it's fine for now). Maybe add a TODO comment about it?
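A minimal sketch of the snapshot-and-pickle step with the suggested TODO (the Options class here is a hypothetical stand-in for mypy's real Options):

```python
import pickle


class Options:
    """Hypothetical stand-in; only snapshot() matters for this sketch."""

    def __init__(self) -> None:
        self.num_workers = 4
        self.strict = True

    def snapshot(self) -> dict:
        # A plain-dict snapshot keeps the serialized form decoupled from
        # the live object, so workers can rebuild options independently.
        return dict(self.__dict__)


options = Options()
# TODO: pickle is convenient but not the fastest or most compact choice;
# consider a more efficient serialization format later.
pickled_options = pickle.dumps(options.snapshot())
restored = pickle.loads(pickled_options)
```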
mypy/build.py
Outdated
    for worker in workers:
        # Start loading graph in each worker as soon as it is up.
        worker.connect()
        source_tuples = [(s.path, s.module, s.text, s.base_dir, s.followed) for s in sources]
Calculate the list outside the loop, since it's the same for each worker.
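The suggested fix, sketched (Source and Worker are simplified stand-ins; the tuple shape follows the diff above):

```python
class Source:
    """Simplified stand-in for mypy's BuildSource."""

    def __init__(self, path, module, text=None, base_dir=None, followed=False):
        self.path = path
        self.module = module
        self.text = text
        self.base_dir = base_dir
        self.followed = followed


started = []


class Worker:
    """Simplified stand-in; connect() just records the call."""

    def connect(self):
        started.append(self)


sources = [Source("a.py", "a"), Source("b.py", "b")]

# Calculated once, before the loop: the list doesn't depend on the worker.
source_tuples = [(s.path, s.module, s.text, s.base_dir, s.followed) for s in sources]

workers = [Worker(), Worker()]
for worker in workers:
    # Start loading graph in each worker as soon as it is up.
    worker.connect()
    # ... send source_tuples to this worker ...
```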
@JukkaL I addressed your comments. Please let me know if you want to take a look again before this is merged.
I want to have another look and try this out a little before merging, probably by Tue/Wed this week.
I tried it on a huge codebase at work, on macOS, and encountered this crash: This is probably related to using sqlite for the cache. More complete output:
Oh yes, using the sqlite cache may be tricky; multiple processes probably can't write at the same time. I will check what the standard workaround for this is (maybe just a retry).
@JukkaL I think
Thanks! I will test using
@JukkaL if you don't have any "large-scale" comments, I would prefer to merge this soon and fix smaller things incrementally in follow-up PRs (this is hidden behind a flag anyway). Otherwise it will just gather dust and merge conflicts.
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
Fixes #933
This is not very polished, but it is a fully functional implementation. It gives a ~1.5x performance improvement for self-check. I think we can keep this feature hidden while we iterate on it. A very high-level overview: we start n workers, each of which loads the graph; the coordinator process then submits SCCs one by one as they become unblocked by their dependencies. Workers use the regular cache to get information about SCCs processed by other workers. There are more details in the docstring for worker.py. Some notes:
* def ready_to_read(conns: list[IPCClient]) -> list[int]
* __all__ / "foo defined here" notes, see "<function> defined here" notes omitted when function is loaded from cache #4772
* mypy/ipc.py ... to using librt.base64. This may not be critical now, but will be important with the new parser, when we will be sending larger chunks of data over the sockets.
I am going to address some of the above issues, and re-enable tests gradually in follow-up PRs. More long term, there are three main areas for further improvement:
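The coordinator's "submit SCCs as they become unblocked" behavior can be sketched as a small dependency scheduler (a serial, single-process simplification of the real coordinator; mypy's actual data structures differ):

```python
from collections import deque


def schedule_sccs(deps: dict[int, set[int]]) -> list[int]:
    """Return SCC ids in an order where each SCC is submitted only after
    all of its dependencies have completed (Kahn's algorithm)."""
    remaining = {scc: set(d) for scc, d in deps.items()}
    dependents: dict[int, set[int]] = {scc: set() for scc in deps}
    for scc, d in deps.items():
        for dep in d:
            dependents[dep].add(scc)
    # SCCs with no unfinished dependencies are ready to go immediately.
    ready = deque(sorted(scc for scc, d in remaining.items() if not d))
    order = []
    while ready:
        scc = ready.popleft()  # in the real build: send this id to a free worker
        order.append(scc)
        for dependent in sorted(dependents[scc]):
            remaining[dependent].discard(scc)
            if not remaining[dependent]:
                ready.append(dependent)  # unblocked: eligible for submission
    return order


# 0 has no deps; 1 and 2 depend on 0; 3 depends on both 1 and 2.
print(schedule_sccs({0: set(), 1: {0}, 2: {0}, 3: {1, 2}}))  # [0, 1, 2, 3]
```

In the real coordinator, "popleft and process" becomes "send to whichever worker is idle", and completion messages from workers drive the unblocking step, which is where the parallelism comes from: 1 and 2 above could run on two workers at once.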