Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(compartment-mapper/node): support anonymous entrypoints #2664

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

boneskull
Copy link
Contributor

@boneskull boneskull commented Dec 18, 2024

An "anonymous" entrypoint is one where the closest package.json does not contain a name field. It is not unusual for CJS packages to create a subdirectory containing package.json like this:

{"type": "module"}

The effect is that any .js file in this subdir will be treated as an ECMAScript module by Node.js.

A problem arises when a module in this subdir is provided as an entrypoint to mapNodeModules().

Note

Why the hell would anyone do this? I just wanted to see if I could generate policy for all of @babel/runtime's exports.

In this case, the "root" package descriptor has no name. Ultimately, the resulting shape of the CompartmentMapDescriptor fails assertCompartmentMap(), presumably due to an undefined value (or "undefined" string!) where one was not expected.

To avoid the failure, we can assign a string value to the root package descriptor's name field.

This is very much a corner case and was discovered by running arbitrary packages through mapNodeModules() (specifically, @babel/runtime).

  • Also added a test for this
  • Also added a broad/shallow test case for mapNodeModules() since one did not exist

Questions at time of commit:

  • Does the name need to be more unique to avoid collisions with other packages in the dependency tree?
  • Severe enough to pull in node:crypto?
  • Can there ever be more than one "anonymous" package?

@boneskull boneskull requested a review from kriskowal December 18, 2024 00:45
@boneskull boneskull self-assigned this Dec 18, 2024
@boneskull boneskull added the bug Something isn't working label Dec 18, 2024
@boneskull boneskull force-pushed the boneskull/anonymous-entry branch from c38c16d to 647c2f7 Compare December 18, 2024 00:47
@boneskull boneskull requested a review from naugtur December 18, 2024 01:49
Copy link
Member

@kriskowal kriskowal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A negative side-effect of this change is that modules under the anonymous package.json would not be able to find any of their dependencies. In Node.js, this is not a problem because Node.js does not consult package.json, but for the Compartment Mapper, this would be fatal.

I would like to consider a similarly surgical change that would probably have the same intended effect: We can alter the search() function to skip any package.json that lacks a name property, such that it would fall through to the true package root. These {"type": "module"} package.jsons just alter the interpretation of .js extensions in a subtree of the containing package.

We might need to do further work to ensure that the .js extension for the inner and outer “packages” get interpreted consistently for their respective subtrees. I think that would entail a moduleTypeHook for importHook that closes over the same package descriptor memo as mapNodeModules, allowing that memo to grow in response to importHook as needed.

@kriskowal
Copy link
Member

I could also imagine solving this problem with an alternate implementation of mapNodeModules and languageHook that does not consult package.json for dependencies at all and instead follows the Node.js search algorithm closely, and note that would have the drawback that it would, like Node.js, only reveal missing dependency declarations after they have been published and consumed elsewhere.

@boneskull
Copy link
Contributor Author

A negative side-effect of this change is that modules under the anonymous package.json would not be able to find any of their dependencies. In Node.js, this is not a problem because Node.js does not consult package.json, but for the Compartment Mapper, this would be fatal.

If Node.js didn't consult package.json then these {"type": "module"} manifests wouldn't work at all 😄

@boneskull
Copy link
Contributor Author

@kriskowal

But anyhow, the more correct solution is what you suggested: "inherit" from the parent package.json. There seem to be some helper functions.

I could also imagine solving this problem with an alternate implementation of mapNodeModules and languageHook that does not consult package.json for dependencies at all and instead follows the Node.js search algorithm closely, and note that would have the drawback that it would, like Node.js, only reveal missing dependency declarations after they have been published and consumed elsewhere.

This is Lavamoat, so we wish to (reasonably) maximize out-of-box compat with Node.js.

@naugtur This involves some of the stuff we were talking about today. I thought you might have opinions on the API.

@naugtur
Copy link
Member

naugtur commented Jan 14, 2025

I would like to consider a similarly surgical change that would probably have the same intended effect: We can alter the search() function to skip any package.json that lacks a name property, such that it would fall through to the true package root. These {"type": "module"} package.jsons just alter the interpretation of .js extensions in a subtree of the containing package.

This sounds like it could work.

We might need to do further work to ensure that the .js extension for the inner and outer “packages” get interpreted consistently for their respective subtrees. I think that would entail a moduleTypeHook for importHook that closes over the same package descriptor memo as mapNodeModules, allowing that memo to grow in response to importHook as needed.

If moduleTypeHook can be avoided, I'd like that. In node.js loaders the module type is part of the result of calling the equivalent of importHook (part of the return value - the record in our case) and a responsibility of the importHook to provide. Are we not implementing similar logic?

This involves some of the stuff we were talking about today. I thought you might have opinions on the API.

API opinions are not really involved in this PR, and I'll use that as pretext instead of admitting I am not ready to give feedback on the API that's substantially better than "not this" at this time.
All I have now is:
Looking at the interaction between @lavamoat/node and @endo/compartment-mapper in the policy generation part of LavaMoat I concluded that too much of endo is exposed to and facilitated by LavaMoat in there. So we'd need a higher level API with configuration or hooks to wrap what we currently expose.

An "anonymous" entrypoint is one where the closest `package.json` does not contain a `name` field. It is not unusual for CJS packages to create a subdirectory containing `package.json` like this:

```json
{"type": "module"}
```

The effect is that any `.js` file in this subdir will be treated as an ECMAScript module by Node.js.

A problem arises when a module in this subdir is provided as an entrypoint to `mapNodeModules()`. In this case, the "root" package descriptor _has no name_. Ultimately, the resulting shape of the `CompartmentMapDescriptor` fails `assertCompartmentMap()`, presumably due to an `undefined` value (or `"undefined"` string!) where one was not expected.

To avoid the failure, we can assign a string value to the root package descriptor's `name` field.

This is very much a corner case and was discovered by running arbitrary packages through `mapNodeModules()` (specifically, `@babel/runtime`).

- Also added a test for this
- Also added a broad/shallow test case for `mapNodeModules()` since one did not exist

* * *

Questions at time of commit:

- Does the name need to be more unique to avoid collisions with other packages in the dependency tree?
- Severe enough to pull in `node:crypto`?
- Can there ever be more than one "anonymous" package?
@naugtur naugtur force-pushed the boneskull/anonymous-entry branch from 647c2f7 to bf8c687 Compare January 14, 2025 13:32
@boneskull
Copy link
Contributor Author

This is tricky.

On the one hand, it looks like we should be able to treat the parent and any "anonymous" package in its scope as a single Compartment, and then configure types (LanguageForModuleSpecifier) on a file-by-file (folder by folder?) basis to tell it which files are ESM and which are CJS. At least, I think that's what that's for? This would make the presence of .js in LanguageForExtension invalid.

On the other, the anonymous package can specify its own dependencies. Node.js doesn't care; if it's in node_modules, it works. But does a dependency in an anonymous package mean its parent should also have access? If the anon pkg doesn't have its own compartment, is it unreachable from other packages?

This is further complicated by import maps. I will not think about that right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants