-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move structured import/export name information into the import/export string #263
Conversation
8bd18a7
to
7a01c2e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looks good to me 👍
5efe6f8
to
12c00e8
Compare
To confirm, these two binary rules: importdecl ::= in:<importname> ed:<externdesc> => (import in ed)
exportdecl ::= en:<exportname> ed:<externdesc> => (export en ed) should change to refer to |
This commit removes the `wasmparser::ComponentExternName` type and updates wasmparser to handle WebAssembly/component-model#263. Names are now always stored as a strings and are not discriminated in the binary format. This additionally refactors the internals of `wasmparser` to handle parsed names differently and updates tools like `wit-component` to handle the new API changes. Test aren't passing yet but things are compiling at least.
One other piece of feedback from bytecodealliance/wasm-tools#1262, an implementation of this (and a few other PRs to the component model). Currently URLs are defined as "stuff that doesn't contain I'll admit though I know nothing about |
@alexcrichton That's a good question and I was wondering about the same options for the same reason. Looking at the EBNF of But another reason in favor of having the Component Model fully parse |
Ok makes sense to me! I'll dig in and implement that |
This commit removes the `wasmparser::ComponentExternName` type and updates wasmparser to handle WebAssembly/component-model#263. Names are now always stored as a strings and are not discriminated in the binary format. This additionally refactors the internals of `wasmparser` to handle parsed names differently and updates tools like `wit-component` to handle the new API changes. Test aren't passing yet but things are compiling at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really great! I'm reviewing the implementation side of this currently as well.
Awesome, thanks to you and Alex for the feedback. I'll merge now since there don't seem to be any open issues and we can file follow-up issues if needed as usual. |
#1262) * Update parsing of component model names This commit removes the `wasmparser::ComponentExternName` type and updates wasmparser to handle WebAssembly/component-model#263. Names are now always stored as a strings and are not discriminated in the binary format. This additionally refactors the internals of `wasmparser` to handle parsed names differently and updates tools like `wit-component` to handle the new API changes. Test aren't passing yet but things are compiling at least. * Update `*.wat` parsing and some tests Also update `wasm-encoder`'s API for the new AST. * Update to allow import/export overlap Handles WebAssembly/component-model#259 * Update wit-component test expectations * Fix tests * Update wasm-encoder to produce "old" binaries In wasm-encoder produce the "old" encoding by default rather than the new encoding. This can get removed once enough tooling supports the "new" encoding. * Implement parsing of new-style names Specified in WebAssembly/component-model#263 * Add comments * Update some TODO * Disallow some kinds of strings in exports * Update parsing of integrity metadata * Review feedback
Starting in WebAssembly/component-model#263 the component model binary specification was updated in a technically breaking way to encode binaries differently. This was intended to be rolled out in a manner that minimized breakage however so bytecodealliance#1262 implemented validation where the prefix byte which changed was actually ignored if it was 0 or 1. The encoder at the time still emitted 1 as the prefix byte, however. The intention was that once the validator had percolated enough the encoder would switch to using 0. Unfortunately this change was basically forgotten about until now with WebAssembly/component-model#391, but now's probably "enough time passed" so the encoder is updated to emit a 0x00 leading byte in this situation instead of the historical 0x01 byte. This will start the transition period to eventually removing the validator's acceptance of the 0x01 byte.
Starting in WebAssembly/component-model#263 the component model binary specification was updated in a technically breaking way to encode binaries differently. This was intended to be rolled out in a manner that minimized breakage however so bytecodealliance#1262 implemented validation where the prefix byte which changed was actually ignored if it was 0 or 1. The encoder at the time still emitted 1 as the prefix byte, however. The intention was that once the validator had percolated enough the encoder would switch to using 0. Unfortunately this change was basically forgotten about until now with WebAssembly/component-model#391, but now's probably "enough time passed" so the encoder is updated to emit a 0x00 leading byte in this situation instead of the historical 0x01 byte. This will start the transition period to eventually removing the validator's acceptance of the 0x01 byte.
* Proceed to the next step in the "remove `interface`" transition Starting in WebAssembly/component-model#263 the component model binary specification was updated in a technically breaking way to encode binaries differently. This was intended to be rolled out in a manner that minimized breakage however so #1262 implemented validation where the prefix byte which changed was actually ignored if it was 0 or 1. The encoder at the time still emitted 1 as the prefix byte, however. The intention was that once the validator had percolated enough the encoder would switch to using 0. Unfortunately this change was basically forgotten about until now with WebAssembly/component-model#391, but now's probably "enough time passed" so the encoder is updated to emit a 0x00 leading byte in this situation instead of the historical 0x01 byte. This will start the transition period to eventually removing the validator's acceptance of the 0x01 byte. * Update test expectation
Update to the latest wit-bindgen, wasm-tools, warg, and wasi-preview1-component-adapter-provider crates. In particular, this updates to the wasm-encoder that always uses a leading zero byte, following WebAssembly/component-model#263. Fixes bytecodealliance#350.
* Update to the latest wit-bindgen etc. Update to the latest wit-bindgen, wasm-tools, warg, and wasi-preview1-component-adapter-provider crates. In particular, this updates to the wasm-encoder that always uses a leading zero byte, following WebAssembly/component-model#263. Fixes #350. * Update the tests for the new wit-component API. * rustfmt * When assigning a world a new name, update `pkg.worlds` too.
This PR fleshes out the idea in #253 to move the structured information of imports/exports into the import/export name string, maintaining the same validation structure, but now inside a single quoted strong. The motivation for this change is to have a simple canonical string for all import/export names that contains all the structured information, without requiring host and toolchain implementations to define their own ad hoc string-serialization conventions. The rationale here is basically the same as the earlier design choice to put method/constructor/etc annotations inside the name strings, with the net effect of making the way externally-meaningful name information is represented in components somewhat more regular.
Some specific details:
(interface ...)
imports.<name>
refer to the kebab-name case. Moreover,<name>
wasn't really a "kebab-name" anyway, since it also contains method/constructor/static annotations (with embedded square brackets and periods). Thus, this PR proposes to rename this production to<plainname>
, attempting to distinguish plain names from all the other kinds of names that carry more semantic information, but happy to hear other thoughts on how to call this.Copying from this PR, an example showing the new import/export names is: