Various fixes that enable running cloud on the JIT #5333
Conversation
Profiling is on by default in this branch, for the moment, for debugging use.
The use of `trace` in racket seems very sensitive to definition and declaration order. It seems like `(trace foo)` must come immediately after `(define foo ...)`.
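The ordering constraint can be illustrated with plain `racket/trace` (the function here is just an example, not one from the runtime):

```racket
#lang racket
(require racket/trace)

;; The trace annotation appears to need to come immediately after the
;; definition it instruments; putting other definitions between the
;; two can prevent tracing from taking effect.
(define (fact n)
  (if (zero? n) 1 (* n (fact (sub1 n)))))
(trace fact)

(fact 3) ; prints the nested call/return tree for fact, then returns 6
```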
Directly building closure structures seems to have foiled racket's optimizer, but it's also desirable to avoid expanding code for all definitions. So, this pre-generates currying functions and makes the main definition of each unison function expand into just the currying of the implementation body. Right now a fixed number of procedures is produced. A fail-over to a general procedure in case we have more arguments than that is TODO.
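A minimal sketch of the idea, for one arity (names are illustrative, not the actual runtime API; the real generated wrappers also handle overapplication and build closure structures):

```racket
#lang racket
;; Pre-generated wrapper for 2-argument implementations. The JIT emits
;; one such wrapper per supported arity, so each unison definition only
;; expands into a call to the right wrapper instead of expanding the
;; full application logic inline.
(define (curry-2 impl)
  (case-lambda
    [(x y) (impl x y)]                ; exact application: call directly
    [(x) (lambda (y) (impl x y))]))   ; partial application: build a closure

(define unison-add (curry-2 +))
(unison-add 1 2)   ; full application => 3
((unison-add 1) 2) ; partial, then apply => 3
```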
This is the value that Haskell uses. The default value of 4 that we were using causes performance degradation.
The existing approach had a faulty check. It was supposed to avoid loading code that was already in the runtime system. However, it instead avoided loading any code whose dependencies were entirely satisfied by what the runtime system already has. The structure of the old loader was a bit weird for doing the correct thing (due to relative inexperience with scheme at the time), so I rewrote the overall function to be structured a bit better. After fixing that, it became clear that newly loaded code depending on already loaded code was just unimplemented, so I implemented it.
Changed abilities to use symbols for the continuation mark keys. Symbols are canonical, while typelinks aren't, and continuation marks work up to pointer equality.

Add groupref builtins. These are additional information beyond termlinks for each function definition, since a single term can induce multiple scheme procedures. There is a field in closures that holds the groupref associated to the closure, and the currying functions have been rewritten to supply the built closures with the right groupref. This means that functions no longer need to be looked up in a global dictionary, which would have been difficult to make thread safe.

Dynamic code loading has been tweaked to fix some issues. Both term and typelink associations are kept track of, and `requires` are generated to import them from modules that have been previously defined. Also, the module name generator has been made thread safe (hopefully), whereas it previously might have had a race condition.

The profiling wrapper has been tweaked to try to get better coverage, although it doesn't seem to have changed much.
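A sketch of the closure-carries-its-groupref idea (struct and field names here are invented for illustration, not the runtime's actual representation):

```racket
#lang racket
;; Each closure records the groupref it was built from, so recovering
;; it is a field access rather than a lookup in a shared, mutable
;; global dictionary that would need synchronization.
(struct unison-closure (groupref proc env) #:transparent)

(define c (unison-closure 'example-groupref + '()))
(unison-closure-groupref c) ; => 'example-groupref
((unison-closure-proc c) 1 2) ; => 3
```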
- Fixed a problem with comparison of functions
- Moved termlink->reference to unison/core
- Implemented code lookup for functions in compiled code. It was accidentally left out.

Some of this is in @unison/internal:
- Implemented decoding of the intermediate code definitions
- Reworked code generation so that the generated module has definitions in an order that won't be rejected by racket.
- Added code datastructure generation to the intermediate module
In a compiled binary, we'll _know_ the code for all our definitions, but it won't necessarily be loaded into the runtime namespace used for dynamic code loading (due to racket peculiarities). So, we need to make sure we don't assume that knowing the code for a definition means that we have evaluated a module with it defined.
Improve error printing on failure, and rename the term lookup function
Apparently the front-end will sometimes send the same definition more than once, and it's relatively easy to check for duplicates in scheme.
Racket provides a convenient locking construct for the purpose.
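The construct in question is presumably a semaphore used via `call-with-semaphore`; a hedged sketch of guarding shared state that way (the counter and naming scheme are illustrative):

```racket
#lang racket
;; Guard shared mutable state with a semaphore. call-with-semaphore
;; acquires the semaphore, runs the thunk, and releases it even if the
;; thunk raises, so concurrent callers never observe a torn update.
(define module-counter-lock (make-semaphore 1))
(define module-counter 0)

;; Atomically bump the counter and build a fresh module name.
(define (fresh-module-name!)
  (call-with-semaphore module-counter-lock
    (lambda ()
      (set! module-counter (add1 module-counter))
      (string->symbol (format "unison-module-~a" module-counter)))))
```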
Although compiled code is emitted so that the code objects are known, they are not loaded as accessible procedures immediately. This means that code you expect to have available (because it's involved in the program you're actually running) may not actually be loaded in a compiled executable. So the loading builtins need an extra step to detect dependencies for which code is known but hasn't been loaded yet; the items to be loaded can then be augmented with that known code.

To make this happen, I've refactored the code loading into multiple procedures. One or two check for actual missing dependencies and signal appropriate results. One crawls the known-but-unloaded code for additional items to load. Finally, one actually loads a module that is complete with respect to a set of code to load. These can be assembled in various ways to implement Code and Value loading.

Some of the structures involved have been reworked a bit as well, to be more convenient. Termlinks have been used more systematically when possible, and racket hash maps are used to store code associations rather than lists that we `remove-duplicates` on.
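The hash-map point can be sketched as follows (the keys and payloads are placeholders, not the runtime's actual code representation):

```racket
#lang racket
;; Keying code associations by termlink in an immutable hash makes
;; duplicate submissions idempotent: a later insert with the same key
;; simply replaces the earlier one, so there is nothing to
;; remove-duplicates over afterwards.
(define (add-code assoc-table termlink code)
  (hash-set assoc-table termlink code))

(define table
  (add-code (add-code (hash) 'termlink-a "code-a")
            'termlink-a "code-a"))
(hash-count table) ; => 1, despite two inserts
```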
Removed an unused function in core.ss. Don't generate termlink decl code in compiled module.
Apparently I was wrong, and this just merges cleanly. Amazing.
Awesome. Is there any specific review you want before merging?
I am not really qualified to review the code changes, but the fact that this makes the Unison Cloud integration tests go from all failing to all passing is a strong indicator that it is a strict improvement. Good work @dolio !
I don't know exactly what code review is appropriate. I know most of you don't really do much scheme, and aren't experts on how the JIT works. Maybe someone could just look over the code and mention anything that seems strange to them.
This PR includes significant changes that facilitate running cloud on the JIT. With them the cloud integration tests should pass. There's also some debugging infrastructure that I added while trying to figure out why the integration tests weren't passing.
`define-unison` will accept a hint that turns on racket tracing for the specified definition. There's no mechanism for generating such definitions from ucm or the like at the moment, but it lets you easily trace if you're debugging via tweaking generated code.

The compiled code generator has support for wrapping the main entry in racket profiling. It uses periodic printing so you can observe the behavior while the program is running. Again, there's no mechanism for enabling this except manual intervention, but it's useful for debugging the JIT right now.
The server socket queue limit has been increased to 2048 to match the Haskell package we use. The racket default is 4, which causes performance problems in scenarios with a lot of connections.
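Concretely, the backlog is the optional `max-allow-wait` argument to `tcp-listen`, which defaults to 4 in Racket (the port number here is just an example):

```racket
#lang racket
(require racket/tcp)

;; Listen with a backlog of 2048 pending connections instead of the
;; Racket default of 4, matching the Haskell networking code.
(define listener (tcp-listen 54321 2048))
```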
`define-unison` has been reworked to use a fixed set of 'curry' functions for enabling partial/overapplication of unison functions. This fixes an optimization problem that was introduced by generating closure structures and using the structure application for the slow path of application, but also avoids generating a lot of code for each function.

Several issues with dynamic code loading and lookup have been fixed. It had previously been neglected for standalone executables, but we didn't have anything that really tests that case. I've also added proper synchronization to the code loading to avoid race conditions when multiple threads load things at the same time.
Abilities have been reworked to use symbols as the continuation mark keys. Symbols are 'interned', which makes them behave properly with regard to equality, and that is necessary for correct behavior. The previous approach used termlink structures, and arranging for a single, unique termlink to be used so that they are all `eq?` instead of merely `equal?` is a bit much to ask.

Some additional minor fixes are included, but I believe the above are the major changes.
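The symbol-key point can be demonstrated in plain racket (`'ability-key` and the `termlink` struct below are illustrative, not the runtime's real names):

```racket
#lang racket
;; Interned symbols are canonical: independently constructed symbols
;; with the same name are eq?, which is what continuation-mark lookup
;; relies on.
(eq? 'ability-key (string->symbol "ability-key")) ; => #t

;; Two separately constructed structures are not eq?, so struct keys
;; only work if everyone shares the very same instance.
(struct termlink (ref))
(eq? (termlink 'x) (termlink 'x)) ; => #f

;; A mark set under a symbol key is found by any occurrence of that
;; symbol:
(with-continuation-mark 'ability-key 'handler
  (continuation-mark-set-first #f 'ability-key)) ; => 'handler
```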
I'm anticipating that I'll have to merge with trunk and fix some stuff before this can be merged, since it's pretty far behind at the moment.