Inline code for function calls in interpreter #5330

Merged: 20 commits into trunk from cp/inline-func-calls on Sep 10, 2024
Conversation

@ChrisPenner (Contributor) commented Sep 5, 2024

Overview

Previously, function calls/applications stored only references to the code they jump to, even though the code each reference refers to is already known at the time we translate to MCode.

This change statically resolves each reference to the code it refers to at code generation time, avoiding lookups at runtime.

Implementation notes

  • Parameterizes most of the MCode types over their comb type so that we can build the computations using CombIx, then "tie the knot" and resolve to Combs via a fixed point (see the sketch below).
  • Adds code to do the resolution in the appropriate spots (code-gen time and deserialization time).
  • Also downgrades parser-typechecker to -O1 at Dan's suggestion; -O2 is pretty intense (and slow) and can probably be sequestered off to just the runtime package now that it's split off, to improve build times. (Reverted at Paul's request.)
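
Roughly what the "tie the knot" step above amounts to, as a minimal sketch: the type names (CombIx, Comb, RComb, resolveCombs) mirror the PR, but the shapes are simplified and this is illustrative rather than the PR's actual implementation.

```haskell
module KnotSketch where

import qualified Data.Map as M

-- Simplified stand-ins for the real MCode types.
newtype CombIx = CombIx Int
  deriving (Eq, Ord, Show)

-- A combinator parameterized over how calls to other combinators are
-- represented: CombIx while generating code, direct pointers afterwards.
data Comb call = Comb
  { combIx :: CombIx -- keep the symbolic index around
  , body   :: [call] -- the code locations this combinator jumps to
  }

-- Resolved form: call sites point straight at other resolved combinators.
newtype RComb = RComb (Comb RComb)

-- "Tie the knot": replace every CombIx with the combinator it names.
-- The result map is defined in terms of itself; Data.Map's lazy values
-- mean each call site holds a thunk that is only forced when followed,
-- so no lookup by reference is needed at runtime.
resolveCombs :: M.Map CombIx (Comb CombIx) -> M.Map CombIx RComb
resolveCombs unresolved = resolved
  where
    resolved =
      M.map
        (\(Comb ix calls) -> RComb (Comb ix (map (resolved M.!) calls)))
        unresolved
```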

Interesting/controversial decisions

Test coverage

  • Existing tests should cover it.

Benchmarks:

@pchiusano/misc-benchmarks/chris:.suite

| Benchmark | this-branch | trunk |
| --- | --- | --- |
| Decode Nat | 748ns | 947ns |
| Generate 100 random numbers | 470.88µs | 623.831µs |
| Count to 1 million | 449.3574ms | 574.758ms |
| Json parsing (per document) | 377.202µs | 461.911µs |
| Count to N (per element) | 533ns | 690ns |
| Count to 1000 | 547.879µs | 691.485µs |
| Mutate a Ref 1000 times | 1.011772ms | 1.288178ms |
| CAS an IO.ref 1000 times | 1.259371ms | 1.589608ms |
| List.range (per element) | 651ns | 875ns |
| List.range 0 1000 | 666.46µs | 879.3µs |
| Set.fromList (range 0 1000) | 3.05485ms | 3.849484ms |
| Map.fromList (range 0 1000) | 2.314902ms | 2.987285ms |
| Map.lookup (1k element map) | 5.877µs | 7.849µs |
| Map.insert (1k element map) | 15.034µs | 19.599µs |
| List.at (1k element list) | 580ns | 768ns |
| Text.split / | 40.449µs | 52.997µs |

@mitchellwrosen/mbta/@mitchellwrosen/branch-for-chris:.runTheBenchmark

| Step | this-branch | trunk |
| --- | --- | --- |
| read bytes | 0.201ms | 0.188ms |
| utf8 decode bytes | 0.165ms | 0.166ms |
| json decode text | 387.17ms | 490.019ms |
| jsonapi parse json | 126.466ms | 170.002ms |
| mbta parse jsonapi | 139.128ms | 182.414ms |

Loose ends

Nope.

@pchiusano (Member)

Looking pretty good so far! Though less dramatic than I hoped. I'll be curious to see it with the same treatment applied to foreign function calls (not sure if you're planning a separate PR for that); I'm guessing that will make the JSON decoding and other benchmarks that use a lot of foreign calls go faster.

Note that trunk is going to be worse the more code you have loaded, since the IntMap will have greater depth. A best-case scenario for trunk is running benchmarks after freshly starting up UCM (or you can use the compile command with ucm run.compiled).

But this isn't terribly realistic since (for instance) Unison Cloud nodes are kept running and are loading new code all the time. And most users aren't bouncing UCM willy-nilly.

```haskell
| Just n <- M.lookup exceptionRef rfTy,
-- TODO: Should I special-case this raise ref and pass it down from the top rather than always looking it up?
```
@ChrisPenner (author):

cc @dolio What do you think about this particular case?

@dolio (Contributor):

I doubt special casing this will make much difference anywhere. This is essentially just the top level handler for exceptions, which will kill the interpreter with an error message when it's invoked.

@ChrisPenner marked this pull request as ready for review on September 6, 2024 at 18:01.
@pchiusano (Member) left a comment:

Nice progress, I'll let @dolio weigh in. I'm assuming you're going to do the analogous ForeignCall optimization as a separate PR?

One note: I think releases should still build everything with -O2 though, including parser-typechecker. We aren't really sure how much it matters there, but we also don't have a good way of testing it, so I'm inclined to leave it alone.

If you want to switch up the release process to support building with different flags than the default then that's cool, but I don't think it does that currently. Or you could do this in a separate PR.

@dolio (Contributor) commented Sep 6, 2024

Reading through some of this, something occurred to me. Is there even a reason for us to number the term references at this point?

I guess it's a little more economical to make numbers for the references and store the latter once for e.g. compiled data. But for actual code functionality, we just make up numbers for the references, then get rid of the numbers by making things circular, right?

```haskell
resolveCombs mayExisting combs =
  -- Fixed point lookup;
  -- We make sure not to force resolved Combs or we'll loop forever.
  let ~resolved =
```
@dolio (Contributor):

This ~ doesn't actually do anything. let is irrefutable by default unless perhaps the Strict extension is enabled, which I wouldn't recommend, because what it actually does can be rather confusing.
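
Not from the PR, just a tiny illustration of that default laziness:

```haskell
module LazyLetDemo where

-- Without the Strict extension, a let binding is lazy (irrefutable),
-- so the error below is never forced and this prints "ok".
-- Under {-# LANGUAGE Strict #-} the binding *would* be forced here,
-- and writing ~_resolved is what would restore the laziness.
main :: IO ()
main = do
  let _resolved = error "never forced" :: Int
  putStrLn "ok"
```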

@ChrisPenner (author):

I can remove it if you like; I'm mostly using it as a signpost, since Haskell doesn't have explicit let-recs 😢

@dolio (Contributor) left a comment:

Okay, I've read through this. I don't see any major issues. It looks about like what I was expecting.

I left a couple comments about minor things, but that's it.

@ChrisPenner (author)

And yes, I'll do the Foreign Funcs as a separate PR so we can track each performance change individually.

@ChrisPenner (author) commented Sep 9, 2024

> @dolio: Is there even a reason for us to number the term references at this point?
>
> I guess it's a little more economical to make numbers for the references and store the latter once for e.g. compiled data. But for actual code functionality, we just make up numbers for the references, then get rid of the numbers by making things circular, right?

Yeah, I think you're right, though tbh it's probably more work to remove them than it's worth at this point; I don't think they're doing any harm 🤷🏼‍♂️

@dolio (Contributor) commented Sep 9, 2024

Yeah, I wouldn't try to remove the numbering I guess.

Actually, the CI failures look suspicious. The mac one reports a stack overflow on the codeops test, so something there might be doing something bad with a circular representation.

@ChrisPenner (author) commented Sep 9, 2024

There's an infinite loop when testing equality on recursive functions; I'm guessing we're naively crawling through and checking equality on all contained sections. We probably just need to lower back to CombIxs when checking equality. Don't worry, we won't merge until CI is passing 😄

Failure case:

```unison
testcase = do
  f n = if n == 0 then 0 else f (Nat.drop 1 n)
  f == f
```

@dolio (Contributor) commented Sep 9, 2024

Oh, right.

Actually, you should just not derive Eq and Ord for RComb. It should test equality via the stored CombIx, no?

For that matter, I guess the Show instance for RComb could be more informative using the CombIx, too, right?
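
A minimal sketch of what that could look like (the field names here are made up; the real RComb carries the resolved code, which is circular and must not drive Eq/Ord/Show):

```haskell
module RCombSketch where

newtype CombIx = CombIx Int
  deriving (Eq, Ord, Show)

-- Illustrative shape only: the symbolic index plus the resolved,
-- possibly circular, call targets.
data RComb = RComb
  { rCombIx   :: CombIx
  , rCombCode :: [RComb]
  }

-- Delegate everything to the stored CombIx so recursive combinators
-- don't send equality, ordering, or printing into an infinite loop.
instance Eq RComb where
  a == b = rCombIx a == rCombIx b

instance Ord RComb where
  compare a b = compare (rCombIx a) (rCombIx b)

instance Show RComb where
  show r = "RComb " ++ show (rCombIx r)
```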

@ChrisPenner (author)

Haha, yeah I had the exact same ideas 😄

@ChrisPenner (author)

We've passed the nimbus test-suite now as well ✅

@ChrisPenner merged commit 814f968 into trunk on Sep 10, 2024 (32 checks passed) and deleted the cp/inline-func-calls branch on September 10, 2024 at 19:10.