Loaders allow breaking JS spec invariants #171

nicolo-ribaudo · 2023-11-08T10:02:01Z

Consider this file:

import * as staticImport from "a";

const dynamicImport = await import("a");

console.log(staticImport === dynamicImport);

Two imports in the same file must always resolve to the same module, so that console.log must always log true. However, the following loader allow breaking that language invariant and console.log will log false:

import { register } from "node:module"

register(import.meta.url);

let counter = 0;

export function resolve(specifier, context, nextResolve) {
  if (specifier === "a") {
    return nextResolve(counter++ ? "./a.js" : "./b.js", context);
  }
  return nextResolve(specifier, context);
}

For reference, the relevant specification is https://tc39.es/ecma262/#sec-HostLoadImportedModule:

If this operation is called multiple times with the same (referrer, specifier) pair and it performs FinishLoadingImportedModule(referrer, specifier, payload, result) where result is a normal completion, then it must perform FinishLoadingImportedModule(referrer, specifier, payload, result) with the same result each time.

Where referrer is the importer module, specifier is the string passed to import ("a" in both cases in this example), and "the result is a normal completion" means that loading the module does not throw.

The text was updated successfully, but these errors were encountered:

aduh95 · 2023-11-08T10:38:25Z

Custom hooks allows you work with non-JS files, e.g. you can make import './file.ts' with custom hooks, and I don't think anyone will complain that the content of file.ts is not parsed in conformance with ECMAScript spec, it's the point of custom hooks.

It's possible to build a hook that's conform to the ECMAScript spec, albeit not very practical as you have to maintain your own resolve cache, but doable, so I'd argue there's nothing that needs to be done from our side. wdyt?

nicolo-ribaudo · 2023-11-08T11:17:19Z

Custom hooks allows you work with non-JS files, e.g. you can make import './file.ts' with custom hooks, and I don't think anyone will complain that the content of file.ts is not parsed in conformance with ECMAScript spec, it's the point of custom hooks.

Well, that is spec-conformant 😛 The spec only gives guarantees about how code inside JS modules/scripts executes, but allows loading modules of other types (think for example about JSON, CSS and WASM modules).

It's possible to build a hook that's conform to the ECMAScript spec, albeit not very practical as you have to maintain your own resolve cache, but doable, so I'd argue there's nothing that needs to be done from our side. wdyt?

I think this is an ok position -- loaders are indeed an advanced feature and the expectation from Node.js could be that people writing them should know what they are doing.

The risk I see here is that a naively implemented loader might accidentally not behave according to the spec. As a user, I would then expect my modules to at least follow JS semantics: they obviously will not follow Node.js semantics, given that well... I'm using a loader to deviate, but I would write my code assuming that "bare JS semantics that work everywhere" will keep being held.

I wonder if the loaders API could do something so that the default behavior is to do "the correct thing" rather than expecting loaders authors to be aware of it.

When a module is loaded twice from the same file, there is some internal caching the load() hook is called just once. Maybe Node.js could implement similar caching for the resolve()` hook, and call it just once (using (importer_module, unresolvedSpecifier) as the cache key).

aduh95 · 2023-11-08T11:24:42Z

I wonder if the loaders API could do something so that the default behavior is to do "the correct thing" rather than expecting loaders authors to be aware of it.

Oh for sure, we ended up implementing a resolve cache precisely to match that part of the spec, see nodejs/node#46662. It was decided that we should stop using that cache as soon as a custom loader is being registered, so in a way we delegate that to hooks authors to "do the right thing" indeed. It'd be nice if there was a way to keep using that cache unless a hook opts-out of it, because there are use cases for which one would want to deviate from the spec, but I couldn't find a way to do it, so that's the situation today.

nicolo-ribaudo · 2023-11-08T11:39:32Z

Thank you for the context :) I see that this has been already discussed explicitly. Feel free to close this issue, or to re-use is as a feature-request for allowing loaders to rely on the built-in resolve cache.

aduh95 · 2023-11-08T13:26:04Z

I think it makes sense to keep it open as a feature request, if someone finds a way to make it opt-out, it would certainly be a welcome change.

ljharb · 2023-11-08T15:40:06Z

What are the use cases where a loader would want to deviate from the spec here?

laverdet · 2023-11-13T17:36:22Z

Custom hooks allows you work with non-JS files, e.g. you can make import './file.ts' with custom hooks, and I don't think anyone will complain that the content of file.ts is not parsed in conformance with ECMAScript spec, it's the point of custom hooks.

The specification provides a means for this. A TypeScript module would be a concrete implementation of a Cyclic Module Record, in the same way Source Text Module is. For all practical purposes this would implemented as a transpilation stage on top of Source Text Module, but regardless running TypeScript in a JavaScript runtime is not a violation of the specification. Otherwise WebAssembly, which is not defined in es-262, would not be compliant.

Given the discussion here: nodejs/node#50618 do we want to revisit the register hook? @ljharb do you have an opinion one way or another? This feature explicitly allows any module to tinker with module resolution in a specification-breaking way without an explicit flag to the CLI.

ljharb · 2023-11-13T18:50:20Z

I think that if it becomes stable and/or commonplace for any JS code in node_modules to be able to change how module resolution works, it will be a security and PR disaster for node. It wouldn't take a security researcher much time to generate enough CVEs that the entire loader feature would have to be removed, or at least restricted to only operating in the startup phase (through --import or another CLI flag).

It's certainly not my decision, but continuing on this course would be a grave mistake, especially in the current climate where governments, including the US, are pushing very hard on supply chain security concerns.

laverdet · 2023-11-13T19:27:10Z

I guess I'm confused about the security argument. Maybe under deno it would be a concern because in that ecosystem system capabilities are strictly opt-in. Given that, in nodejs, every module has complete access to the running user's account [including the capability to run arbitrary binary code] isn't that a greater concern?

Anyway this is pretty off-topic but the design choice of making register a runtime function is baffling to me for entirely different reasons. Seems like I missed the boat on that discussion though and it's just something that we'll have to tolerate.

GeoffreyBooth · 2023-11-13T20:56:46Z

Given that, in nodejs, every module has complete access to the running user’s account [including the capability to run arbitrary binary code] isn’t that a greater concern?

Agreed, I don’t see the security concern here. It’s always been part of Node’s security model that all code, including code within node_modules, is considered trusted code. There’s no sandboxing of dependencies or protection from anything that they can do.

register works at runtime because of the move of module hooks to run off thread, and the need for many hooks to do things on the main thread. One of the primary purposes of register is to set up the communications channel with the initialize hook, so that actions can have effects both off-thread and on the main thread. That’s a lot to try to accomplish via CLI flags. The recommended pattern is to call register within a file referenced by --import, so that it’s loaded before application code. Any hooks registered later would only effect modules import()-ed after registration, which is little to nothing for most apps.

laverdet · 2023-11-13T21:35:54Z

require.register was just an awful experience as a module consumer so I was super disappointed to see it come back. A lot of naive module authors add something like require('ts-node/register') at the top of their project, publish unstripped TypeScript and call it a day. Then deep in some dependency you end up with several versions of TypeScript running for no reason. It can be hell to track down. It shouldn't be configurable at runtime or you'll see the same thing ESM. Or at the very least invocations to register should throw once the main entrypoint has been entered to prevent this behavior spreading.

GeoffreyBooth · 2023-11-13T22:25:04Z

A lot of naive module authors add something like require('ts-node/register') at the top of their project

We need to provide equivalent functionality in ESM, though. Some people can’t define the flags used to run Node for whatever reason, and need to do things like add an instrumentation library at runtime. (You’ve probably seen instructions like “ensure that require('instrumentation-library') is the first line of your application.”) Yes it’s unfortunate that the API can be misused, but that’s the case for any API. We do try to limit footguns that we provide users, but some things are footguns in one context but completely necessary in another.

Really the lesson here is that users need to be responsible for their dependencies. They don’t have to read every line of source code of a dependency, but there are good tools now for scanning libraries and they can find red flags to warn about misbehaving dependencies.

aduh95 · 2023-11-13T23:04:26Z

We need to provide equivalent functionality in ESM, though

You say "we need", but I think it's more accurate to say "I want". Nothing's wrong with that, but it's only fair to acknowledge it's a self imposed limit, and is up for debate.

GeoffreyBooth · 2023-11-14T01:03:46Z

You say “we need”, but I think it’s more accurate to say “I want”.

It’s even more accurate to say that it’s an explicitly defined goal for the API: https://github.com/nodejs/loaders#milestone-1-parity-with-commonjs. The point of creating these hooks was to allow the user to do in ESM whatever they could do in CommonJS. I would be fine with removing the ability to register hooks after entry point evaluation once we also remove the ability to monkey-patch in CommonJS, to preserve parity.

ljharb · 2023-11-14T01:39:09Z

So it sounds like you're saying that monkey-patching in CJS is explicitly part of the API, which is why parity for it is appropriate?

Your previous statements are that it's not part of the API, which would mean parity would not be appropriate.

arcanis · 2023-11-14T08:22:34Z

It seems to me this discussion is getting very off-topic.

the cache is used only when there are no loaders involved (there wasn't consensus if we wanted to also have a cache when loaders are involved; the upside is that it simplify the cache setup as the default resolve is sync, so no async failure to take care of).

Do you have details on the discussion around that (I'm also curious about some of the anticipated use cases for resolvers returning unstable results; I can imagine some, but they don't seem very practical)?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Loaders allow breaking JS spec invariants #171

Loaders allow breaking JS spec invariants #171

nicolo-ribaudo commented Nov 8, 2023

aduh95 commented Nov 8, 2023

nicolo-ribaudo commented Nov 8, 2023 •

edited

Loading

aduh95 commented Nov 8, 2023

nicolo-ribaudo commented Nov 8, 2023

aduh95 commented Nov 8, 2023

ljharb commented Nov 8, 2023

laverdet commented Nov 13, 2023

ljharb commented Nov 13, 2023

laverdet commented Nov 13, 2023

GeoffreyBooth commented Nov 13, 2023

laverdet commented Nov 13, 2023

GeoffreyBooth commented Nov 13, 2023

aduh95 commented Nov 13, 2023

GeoffreyBooth commented Nov 14, 2023

ljharb commented Nov 14, 2023

arcanis commented Nov 14, 2023

Loaders allow breaking JS spec invariants #171

Loaders allow breaking JS spec invariants #171

Comments

nicolo-ribaudo commented Nov 8, 2023

aduh95 commented Nov 8, 2023

nicolo-ribaudo commented Nov 8, 2023 • edited Loading

aduh95 commented Nov 8, 2023

nicolo-ribaudo commented Nov 8, 2023

aduh95 commented Nov 8, 2023

ljharb commented Nov 8, 2023

laverdet commented Nov 13, 2023

ljharb commented Nov 13, 2023

laverdet commented Nov 13, 2023

GeoffreyBooth commented Nov 13, 2023

laverdet commented Nov 13, 2023

GeoffreyBooth commented Nov 13, 2023

aduh95 commented Nov 13, 2023

GeoffreyBooth commented Nov 14, 2023

ljharb commented Nov 14, 2023

arcanis commented Nov 14, 2023

nicolo-ribaudo commented Nov 8, 2023 •

edited

Loading