Add BuildTargets.md #378

lukewagner · 2024-07-17T18:44:58Z

This PR adds BuildTargets.md to define this new concept of "build targets" as presented in both CG-06 and WASI-06-12, which itself was a revision of the earlier "wasit2" idea that came up in WASI/#595.

Currently, BuildTargets.md only defines one build target, wasm32, but with the intention of adding more in the future (e.g., wasm64 and wasmgc). See BuildTargets.md in this PR for more details.

The goal, mentioned at the end of BuildTargets.md, is that wasi-sdk and various other currently-component-producing toolchains would be able to emit core modules that matched wasm32 as defined here (with "componentification" being merely a final linker option of whether to run wasm-ld or wasm-component-ld). I plan to leave this open until we can get some implementation feedback to test this out.

elewis787 · 2024-08-23T17:34:57Z

Would love to see this added and I am happy to test this. Let me know how/if I can support.

lukewagner · 2024-08-26T17:50:05Z

@elewis787 Great to hear! Probably the next thing we need for this PR to merge is a concrete implementation just to validate what's here makes sense. In particular, one great proof-of-concept would be extending the wit-component crate to accept any core module targeting wasm32 as defined here and emit a wasip2 component that runs on unmodified wasmtime or jco.

elewis787 · 2024-08-26T18:04:58Z

@lukewagner sounds great.

A very basic example showing wasip1 and wasip2 and using WIT to generate bindings can be found here https://github.com/elewis787/wasip1-wasip2.

The goal is to show that TinyGo (dev branch supporting wasip1 and wasip2) can be used to target a component or a module. Once these are compiled, this example shows that the wasmtime-go bindings (currently wasip1 support only) can be used for core modules, and wasmtime-cli (or Rust wasmtime, or jco) can be used for components.

Let me know if this is on the right track for what you would like to see. It may be a slightly different use case. This focuses more on proving that a WIT component (with limitations) can be a core module per wasip1.

Regardless, I will start digging into the wit-component crate.

jeff1010322 · 2024-08-26T20:24:32Z

I am looking into this as well. Does this mean there would be build target support for generating modules (wasip1 only) versus generating components (wasip2) from compilers?

I have been testing out WebAssembly modules with JCO and through their transpile command I can convert the component into a core module that can be run in runtimes that don't support components like wasmtime-go. I just have to make sure the runtime links any imports called by the core module. In JCO you can manually disable the WASI P2 features with a few flags.

But is there a plan to make core modules without WASI P2 imports a build target for things like this?

alexcrichton · 2024-08-26T20:37:54Z

From a technical perspective almost all components today start as single core modules. This is the case for nearly all compiles for TinyGo, Rust, and C (e.g. LLVM-based toolchains). In all of these cases LLVM emits a single core wasm module which is then fed into wit-component which generates a component. So in that sense the answer to:

> But is there a plan to make core modules without WASI P2 imports a build target for things like this?

is "yes that's sort of already supported today". I say "sort of" in that the way this typically works is that the original binary uses a mixture of wasi_snapshot_preview1 APIs plus WIT-bound APIs. That means it's a sort of hybrid p1/p2 module which doesn't necessarily have clear semantics. The componentization process papers over all of this to produce a single component out the other end.

The goal of this proposal is to formalize the core-module-to-component step of tooling. It makes the core module artifact (which today becomes a component) a first-class artifact. It additionally defines what it means to import/export various names from a core module. One consequence is that most core modules emitted by compilers today are not valid inputs for becoming a component, because they import `wasi_snapshot_preview1`.

This means that one big chunk in implementing this proposal is going to be updating language standard libraries to exclusively use WASIp2 APIs instead of a mixture of WASIp1 and WASIp2. This would likely start with wasi-libc and start bubbling up from there for example.
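To make the hybrid p1/p2 point concrete, here is a minimal sketch (not part of wasm-tools) that reads the import section of a core module and reports whether it still imports from `wasi_snapshot_preview1`. The helper names are hypothetical, and the parser assumes single-byte reference types in table imports and no tag imports.

```python
WASIP1_MODULE = "wasi_snapshot_preview1"


def read_leb_u(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_name(data, pos):
    """Decode a length-prefixed UTF-8 name; return (name, next_pos)."""
    length, pos = read_leb_u(data, pos)
    return data[pos:pos + length].decode("utf-8"), pos + length


def skip_limits(data, pos):
    """Skip a limits entry (flags, minimum, optional maximum)."""
    flags = data[pos]
    _, pos = read_leb_u(data, pos + 1)      # minimum
    if flags & 0x01:
        _, pos = read_leb_u(data, pos)      # maximum, only if flagged
    return pos


def import_names(wasm):
    """Return the (module, field) pairs from a core module's import section."""
    if wasm[:8] != b"\x00asm\x01\x00\x00\x00":
        raise ValueError("not a core wasm module")
    pos, imports = 8, []
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_leb_u(wasm, pos + 1)
        end = pos + size
        if section_id == 2:                 # import section
            count, pos = read_leb_u(wasm, pos)
            for _ in range(count):
                module, pos = read_name(wasm, pos)
                field, pos = read_name(wasm, pos)
                kind = wasm[pos]
                pos += 1
                if kind == 0x00:            # func: type index
                    _, pos = read_leb_u(wasm, pos)
                elif kind == 0x01:          # table: reftype byte + limits
                    pos = skip_limits(wasm, pos + 1)
                elif kind == 0x02:          # memory: limits
                    pos = skip_limits(wasm, pos)
                elif kind == 0x03:          # global: valtype + mutability
                    pos += 2
                else:
                    raise ValueError(f"unsupported import kind {kind:#x}")
                imports.append((module, field))
        pos = end                           # skip every other section
    return imports


def uses_wasip1(wasm):
    """True if any import comes from the WASIp1 module name."""
    return any(module == WASIP1_MODULE for module, _ in import_names(wasm))
```

A module for which `uses_wasip1` returns true alongside WIT-bound imports is the hybrid case described above. The sketch only inspects names and does not validate the module.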

elewis787 · 2024-08-26T20:59:33Z

Thanks @alexcrichton! This is helpful and clear.

As a naive example, does this demonstrate what you are discussing?

import "wasi_snapshot_preview1" "fd_write" with wasip1
vs
import "wasi:filesystem/types@0.2.0" "[method]descriptor.write" with wasip2

If so, is a byproduct of this a core module that is produced but using wasip2 imports before componentization? This means that in the example I provided the obvious difference is that the module produced does not use wasip2 syntax.

Are there any plans for taking a component to a module to be backward compatible with wasip1? I know wasip1 does not include some of the wasip2 system interfaces and many have changed but I have been able to side step some of the tooling support by implementing what is needed by the core module. This is still useful.

Thanks for walking through regardless!

alexcrichton · 2024-08-26T21:10:54Z

Indeed yeah the differences will come down to imports. The first one you listed is WASIp1, and the second one is WASIp2-as-defined-by-wit-component-today which is not a standard and does not match what this document is proposing. This document would use a WASIp2 target that looks like:

(import "cm32p2|ns:wasi:filesystem/types@0.2" "[method]descriptor.write" (func (param ...)))

Note the prefix cm32p2|ns: on the import module name, plus the lack of a trailing .0 in the version.
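As a rough illustration of the two differences just named, the sketch below rewrites a wit-component-style import module name into the prefixed form. The function name `legacy_to_cm32p2` is hypothetical, `wasi:filesystem/types@0.2.0` is only an assumed example input, and the version handling is a simplification based solely on this comment; the authoritative rules are whatever BuildTargets.md specifies.

```python
PREFIX = "cm32p2|ns:"  # prefix quoted in the comment above


def legacy_to_cm32p2(module_name):
    """Rewrite a wit-component-style import module name (illustrative only)."""
    name, sep, version = module_name.partition("@")
    if not sep:
        # Assumption: an unversioned name only gains the prefix.
        return PREFIX + name
    # Assumption: a trailing ".0" patch component is dropped, so 0.2.0 -> 0.2.
    if version.endswith(".0") and version.count(".") == 2:
        version = version[:-2]
    return f"{PREFIX}{name}@{version}"


print(legacy_to_cm32p2("wasi:filesystem/types@0.2.0"))
```

The field name (`[method]descriptor.write` in the example) is left untouched here, matching the import shown above.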

> If so, is a byproduct of this a core module that is produced but using wasip2 imports before componentization?

Yes, and this document is describing the exact name of component imports/exports to be "componentized", optionally, later on. This document is a formalization of the names used today which is intended to become standard (as opposed to whatever was implemented to begin with)

> Are there any plans for taking a component to a module to be backward compatible with wasip1?

Not currently, but that's also somewhat out of scope for this PR.

elewis787 · 2024-08-26T21:23:39Z

@alexcrichton, thanks for explaining this.

@jeff1010322 and I are both in the jco exploration phase. As mentioned, we are playing around with the transpile flags to produce core modules. As a next step, do you feel it is helpful to look into the generation of the core module and take a stab at outlining the behavior listed here?

As far as backward compatibility, do you know of a place to discuss/contribute to this?

After using WIT and a few various tools, it seems like there may be some options/benefits in the bindings generated. Candidly, I am still getting up to speed with WASIp2 and may be incorrect.

alexcrichton · 2024-08-26T21:46:05Z

Oh no worries! Always great to have more folks help and happy to help out where I can!

I think the first step is what you're already doing here which is to make sure you feel comfortable with the current process of how a component is created. The high-level overview of that is that source code is written with the help of wit-bindgen frequently, compiled with a language's standard library, and assembled with LLVM's wasm-ld to produce a core wasm module. This module is fed through wit-component to produce a component.

The next step is going to be working backwards in this pipeline from the end back to the front. For example if you were to change the core wasm module today nothing would be able to turn it into a component. In that sense you'll want to do as @lukewagner suggested which is to start with wit-component, the final layer of this stack. That lives in the wasm-tools repository and you'll basically be updating various locations of where a core wasm "thing" is matched up to a component model "thing" to match the conventions outlined in this document. That repository has a whole bunch of handwritten tests in *.wat syntax which you can add to for this.

Once that's all implemented, the next step is probably to work on wit-bindgen. That'll mean updating all of its generated code to reference the new names for imports/exports. That'll probably feel more natural once wit-component is implemented.

From there the next hard part will be to flesh out language standard libraries, but that's probably best tackled once we're closer to that.

elewis787 · 2024-08-27T00:55:44Z

Great overview. Thanks again!

I'll start looking into this. I am still getting up to date on the various tools but I believe I understand what's required now.

This commit implements recognition of the `_initialize` function from WASIp1 in the componentization process of `wasm-tools component new`. This additionally corresponds to the same function in the proposed [BuildTargets.md](WebAssembly/component-model#378). This is implemented by having a small core wasm module which is just an import and a `start` section get instantiated at the end of a component to run `_initialize` before all other exports.
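For readers unfamiliar with what such a module looks like, here is a minimal sketch that assembles one by hand: a single `() -> ()` type, one function import, and a `start` section naming that import. The empty import module name and the helper names are assumptions for illustration; the names wasm-tools actually uses are not stated in the commit message.

```python
def leb_u(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def name(text):
    """Encode a length-prefixed UTF-8 name."""
    raw = text.encode("utf-8")
    return leb_u(len(raw)) + raw


def section(section_id, payload):
    """Wrap a payload in a section header (id + byte length)."""
    return bytes([section_id]) + leb_u(len(payload)) + payload


def initialize_shim(import_module="", import_field="_initialize"):
    """Build a core module that imports one function and runs it as `start`."""
    type_sec = section(1, leb_u(1) + b"\x60\x00\x00")       # one type: () -> ()
    import_sec = section(
        2,
        leb_u(1) + name(import_module) + name(import_field)
        + b"\x00" + leb_u(0),                                # func import, type 0
    )
    start_sec = section(8, leb_u(0))                         # start = function 0
    return b"\x00asm\x01\x00\x00\x00" + type_sec + import_sec + start_sec


shim = initialize_shim()
print(shim.hex())
```

Instantiating a module like this runs the imported function immediately, which is how the commit describes getting `_initialize` to run before any other export is called.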

alexcrichton · 2024-09-28T15:30:57Z

I have an implementation of this at bytecodealliance/wasm-tools#1828 which I believe captures everything here.

lukewagner · 2024-09-30T16:18:27Z

Nice! Let me know once things seem far enough along that you feel confident that what's in this PR can match reality.

…odel (#1828)

* Use `ExportMap` for naming component exports

  Use the map's metadata to determine what the core wasm name is for each export instead of recalculating it in the encoder, which would duplicate work done in validation.

* Decouple import/export encodings from core names

  This commit decouples the string encodings listed for imports/exports from their core wasm names so that they are registered with WIT-level constructs instead. Previously the parsing phase of a module would register a string encoding for core wasm import/export names, but this subverted the logic of validation, where detection of how exactly an import lines up with WIT-level items is determined. The goal of this commit is to decouple this relation.

  Worlds are encoded into custom sections with a known string encoding for all imports/exports of that world. This can possibly differ for different parts of an application to theoretically enable one interface to be imported with UTF-8 and another with UTF-16. This means that encodings are tracked per-import/export rather than per-world.

  Previously this process would assume that there is a single name for an import's/export's encoding, but with new detection and names coming down the line this is no longer going to be the case. For example, with the new names in WebAssembly/component-model#378 there are new names to be supported, meaning that there's not one single name to register encodings with.

  To help bridge this gap the abstraction here is changed to where metadata for a module records string encodings on a WIT level, for example per WIT import/export, instead of per core wasm import/export. Then during encoding of a component the WIT-level constructs are matched up instead of the core names to determine the string encoding in the lift/lower operation. The end goal is that the connection between core wasm names and WIT names continues to be decoupled, where validation is the only location concerned about this.

* Remove core wasm name guess in adapter GC

  This commit removes the need for the GC pass on the adapter module to guess what core wasm export names are needed for WIT. Previously it was assumed that certain exports would have exact core wasm names, but that's going to change soon, so this refactoring is empowering these future changes.

  The GC pass for adapters is restructured to run validation over the non-GC'd adapter first. This validation pass will identify WIT export functions and such, and then this information is used to determine the set of live exports. These live exports are then used to perform a GC pass, and afterwards the validation pass is run a second time to recalculate information with possibly-removed imports.

* Support the new name mangling scheme for components

  This commit adds support for WebAssembly/component-model#378 to `wit-component`. Notably a new set of alternative names are registered and recognized during the module-to-component translation process. Support for the previous set of names is preserved and will continue for some time. The new names are, for now, recognized in parallel to the old names.

  This involved some refactoring to the validation part of `wit-component` and further encapsulation of various names to one small location instead of a shared location for everywhere else to use as well.

* Update `embed --dummy` with new ABI names

  This commit updates the `wasm-tools component embed` subcommand, specifically the `--dummy` flag. This flag now uses the new "standard32" names for the core module that is generated. Additionally a new `--dummy-names $FOO` option has been added to enable generating the old names as well as the new names.

  Utilities have also been added to `Resolve` for bindings generators to avoid hardcoding ABI names and instead use the add categories of imports/exports to name items.

* Add a flag to require the new mangling scheme

  This commit adds a new `--reject-legacy-names` flag to the `wasm-tools component new` subcommand which can be used to disable support for the legacy naming scheme. This is intended to help with testing out the new naming scheme for tools and to help evaluate in the future if it's theoretically possible to remove support for the old naming scheme.

* Fix tests

* Update some test expectations
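The encoding decoupling described in this commit message can be pictured with a small model. This is a sketch under stated assumptions, not the `wit-component` data structures: `WitImport`, `candidate_core_names`, `validate`, and `encoding_for` are hypothetical names, and the mangled spelling (prefix plus major.minor version) is assumed from the discussion in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WitImport:
    """A WIT-level import, independent of how core wasm spells it."""
    interface: str   # e.g. "wasi:filesystem/types"
    version: str     # e.g. "0.2.0"
    function: str    # e.g. "[method]descriptor.write"


def candidate_core_names(item):
    """Core (module, field) spellings accepted for one WIT item."""
    legacy = f"{item.interface}@{item.version}"
    major_minor = ".".join(item.version.split(".")[:2])
    mangled = f"cm32p2|ns:{item.interface}@{major_minor}"  # assumed form
    return [(legacy, item.function), (mangled, item.function)]


def validate(core_imports, world):
    """Match core imports to WIT items; the only step that looks at names."""
    by_core_name = {n: item for item in world for n in candidate_core_names(item)}
    return {core: by_core_name[core] for core in core_imports}


def encoding_for(core_import, matches, encodings):
    """Encoder step: look the encoding up by WIT item, never by core name."""
    return encodings[matches[core_import]]


write = WitImport("wasi:filesystem/types", "0.2.0", "[method]descriptor.write")
legacy = ("wasi:filesystem/types@0.2.0", write.function)
mangled = ("cm32p2|ns:wasi:filesystem/types@0.2", write.function)
matches = validate([legacy, mangled], [write])
encodings = {write: "utf8"}   # recorded once, per WIT import
print(encoding_for(legacy, matches, encodings), encoding_for(mangled, matches, encodings))
```

Because the encoding table is keyed by the WIT item, adding another accepted spelling only touches `candidate_core_names`, which mirrors the commit's stated goal of keeping name knowledge inside validation.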

sunfishcode

Overall this looks good!

sunfishcode · 2025-01-15T13:13:16Z

design/mvp/BuildTargets.md

+Furthermore, any module matching a Core WebAssembly build target can be
+trivially wrapped (e.g., by [`wasm-tools component new`]) to become a
+semantically-equivalent component and thus these modules can be considered
+**simple components**.


This document currently bundles two distinct concepts: a method of binding to a world, and a set of fixed worlds. I find that confusing, because I expect it'll be useful to use the binding conventions in this document with custom worlds in some contexts.

What would you think about splitting this "Build Targets" document into two documents, one called "Toolchain ABIs" and one called "Build Targets"? "Toolchain ABIs" would define cm32p3, cm64p3, and cmgcp3, the rules for start functions and stacks and reentrancy and name mangling and cabi options and memory imports/exports, and then "Build Targets" would tuple up specific Toolchain ABIs with specific sets of fixed worlds and assign them build target names, such as wasm32-wasip3.

And (brainstorming) perhaps we could also consider renaming the "Canonical ABI" to the "Common ABI"? The word "canonical" calls to mind phrases like "canonical form" which are all about "there are other ways to do this but we're blessing this specific one as canonical". With the word "common", we'd say that the Common ABI defines ABI conventions that are common to multiple Toolchain ABIs. And "common" still leaves linguistic room for custom adapters in the future.

lukewagner · 2025-01-15

Maybe the wording is unclear, but the intention of this document isn't to bind to any specific world; custom worlds should be given a core module build target by this document just as well as standard ones. There is some wording about "fixing" a world, but that's just for the purpose of the rest of the description so we can just say "the set of imports" (implicitly referring to that fixed world) instead of something more verbose.

sunfishcode · 2025-01-16

Ah, that wasn't clear to me.

I'd suggest rewording the sentence "The rest of this document assumes a single, fixed 'target world'" to say something more like "build targets are paired with a target world, producing a complete ABI".

Another particular thing that confused me was that the document says build targets provide that "any module matching a Core WebAssembly build target can be trivially wrapped", but this is only true if component new knows the world, or the module contains the special custom section produced by wit-bindgen, which I guessed you wouldn't want to depend on here.

lukewagner · 2025-01-16

Great points! I'll improve those sentences to be more clear.

sunfishcode · 2025-01-15T13:26:34Z

design/mvp/BuildTargets.md

+* For `wasi-libc`-based toolchains like `wasi-sdk` or `rustc`:
+  * The `--target` is `wasm32-wasip2`, combining the `wasm32` build target
+    defined in this document with the additional information that the language
+    runtime can import [WASI Preview 2]-defined interfaces.


The Target World section above and the use case described above of trivially wrapping a module into a component both seem to say that the set of interfaces a language runtime can import are part of the build target, however this paragraph seems to say that they are additional information to the build target. To me, this points to value in splitting out a separate "Toolchain ABI" document.

Including fixed worlds in the build target concept means that they could correspond 1:1 to compiler flags. Instead of saying that "wasm32" is a build target, we could say that "wasm32-wasip3" is a build target. And we could rename what this document calls "wasm32" to "cm32p3", aligning it with the prefix it corresponds to, and avoiding confusion with the name "wasm32" which has an existing long-standing meaning in toolchains.

So concretely, with this approach, we would say:

wasm32-wasip3 is a build target that combines the wasm32 virtual architecture with the wasip3 virtual OS. wasip3 on wasm32 is defined to use the cm32p3 toolchain ABI (and the "cm32p3" prefix string). wasm32-wasip3 would appear in compiler flags like --target=wasm32-wasip3. GOARCH would be wasm32 and GOOS would be wasip3.

The decision of whether the final output is a module or a component would still be a -buildmode or other independent flag, as currently described.
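The naming relationship proposed in this comment can be sketched as a small mapping from a target string to its architecture, OS, and toolchain ABI prefix. The function names are hypothetical, and the "cm" + bits + "p" + preview-number rule is inferred from the examples in this thread (wasm32-wasip3 pairing with cm32p3) rather than taken from any specification.

```python
def toolchain_abi(target):
    """Map a target such as "wasm32-wasip3" to (arch, os, abi_prefix)."""
    arch, _, os_name = target.partition("-")
    bits, preview = arch[4:], os_name[5:]
    if not (arch.startswith("wasm") and bits.isdigit()):
        raise ValueError(f"unrecognized architecture in {target!r}")
    if not (os_name.startswith("wasip") and preview.isdigit()):
        raise ValueError(f"unrecognized OS in {target!r}")
    if int(preview) < 2:
        # WASIp1 predates the component model, so it has no "cm" ABI here.
        raise ValueError(f"{os_name} has no component-model toolchain ABI")
    return arch, os_name, f"cm{bits}p{preview}"


def go_env(target):
    """The GOARCH/GOOS pair that would select the same target."""
    arch, os_name, _ = toolchain_abi(target)
    return {"GOARCH": arch, "GOOS": os_name}


print(toolchain_abi("wasm32-wasip3"))
print(go_env("wasm32-wasip3"))
```

Whether the output is a module or a component is deliberately absent from this mapping, in line with the point that it stays an independent flag.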

lukewagner · 2025-01-15

Agreed in large part; indeed some parts of this doc should be updated to reflect newer discussions with various language toolchains on what to do about WASI interfaces (esp. those in wasi:cli/imports) that are optionally used by various parts of the standard library and language runtime that don't a priori know the "target world" (which, to your point, is indeed orthogonal to the "build target").

Based on the newer discussions, I agree with your second paragraph that cm32p2 (and/or cm32p3, depending on timing) are the "build target(s)" defined by this doc. And then what this "build target" would say is:

* "you're using core wasm with 32-bit linear memory"
* "you're able to use the "Preview {2,3}" WIT/C-M feature set with the canonopts set by this document"
* "you're aware of the existence of the WASI interfaces that have reached Phase 3 and have been stamped with a 0.{2,3}.* version, but whether they are implemented by the host runtime is not guaranteed by the build target -- the interfaces may or may not be implemented by the host and thus you should optionally import them if you don't know for sure that the target world (which is an orthogonal concept from 'build target') includes them".

This leaves open the question of how one "optionally imports". In the medium-term the plan is to add optional to the C-M/WIT, but in the short-term in BuildTargets.md we could simply emit 2 core function imports for each optional import (the one to ask whether you got it and the one that traps if you call it and you didn't get it) with appropriately-mangled names.

Then in, e.g., wasi-sdk, we could say that --target=wasm32-wasip2 is the CLI syntax to ask for the cm32p2 build target, whereas in Go, GOARCH=wasm32 and GOOS=wasip2 is the build configuration that asks for the same cm32p2 build target. The target-world-agnostic language runtime and stdlib code would then optionally import everything they might want (with graceful fallback if the "did I get it?" function returns "no"), and then wasm-component-ld (which does know the target world) can trivially implement the core function imports appropriately based on this knowledge.
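The short-term "two core function imports per optional import" idea can be modeled in a few lines. This is only a behavioral sketch: `link_optional_import` stands in for what the comment assigns to wasm-component-ld, `Trap` stands in for a wasm trap, and the import name string is illustrative rather than a proposed mangling.

```python
class Trap(Exception):
    """Stands in for a wasm trap."""


def link_optional_import(name, target_world):
    """Return the (is_available, call) pair for one optional import.

    `target_world` maps import names to host callables; only the linker
    stand-in sees it, the calling code does not.
    """
    impl = target_world.get(name)

    def is_available():
        # The "did I get it?" function.
        return impl is not None

    def call(*args):
        # The function that traps if it is called but was not provided.
        if impl is None:
            raise Trap(f"optional import {name!r} is not in the target world")
        return impl(*args)

    return is_available, call


def environment_or_empty(target_world):
    """Stdlib-style caller: probe first, fall back gracefully if absent."""
    available, get_environment = link_optional_import(
        "wasi:cli/environment#get-environment", target_world)
    return get_environment() if available() else []


full_world = {"wasi:cli/environment#get-environment": lambda: [("HOME", "/")]}
print(environment_or_empty(full_world))
print(environment_or_empty({}))
```

The caller never inspects the target world itself; it only sees the pair of functions, which is what lets the runtime and stdlib stay target-world-agnostic.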

Does that make sense?

sunfishcode · 2025-01-16

To my ears, the name "build target" for this concept sounds like something that end users will recognize, but will recognize as something other than what you're using it to mean here.

That aside, I do like using the name "cm32p2" for this concept because that means it directly corresponds to the name prefix it uses, which is nice.

The wording "you're aware of the existence of the WASI interfaces that have [...]" could perhaps be expressed more generically, to avoid making it sound specific to WASI. Perhaps "there may be features in the target world that not all hosts provide".

lukewagner · 2025-01-16

I'm not wed to the phrase "Build targets" so we could change it to "Toolchain ABI", as you suggested. (Originally, I had thought that there would be both "core module" and "component" "build targets", but that has since shifted so that now we're only talking about core modules (that may or may not get wrapped into components), so "ABI" makes a lot more sense now.)

> The wording "you're aware of the existence of the WASI interfaces that have [...]" could perhaps be expressed more generically, to avoid making it sound specific to WASI. Perhaps "there may be features in the target world that not all hosts provide".

That's a fair point. But cm32p2 is specific to the amorphous "Preview {2,3}" concept. Previews are meant to go away when we hit 1.0, of course, leaving presumably cm32 as the Build Target / Toolchain ABI. And then yeah, whether you choose to (optionally or not) import WASI is orthogonal to the use of cm32 (or cm64 or cm-gc).

design/mvp/BuildTargets.md

Co-authored-by: Dan Gohman <[email protected]>

lukewagner mentioned this pull request Jul 19, 2024

Caller provided buffers question #369

Open

ydnar mentioned this pull request Aug 23, 2024

build tag wasip2 bytecodealliance/go-modules#142

Closed

Define the wasm32 build target

dca900a

lukewagner force-pushed the add-build-targets branch from 3ae88c0 to dca900a Compare August 26, 2024 17:43

alexcrichton mentioned this pull request Aug 29, 2024

Recognize _initialize in wasm-tools component new bytecodealliance/wasm-tools#1747

Merged

rajsite mentioned this pull request Sep 1, 2024

Consider AssemblyScript compatible host bindings CanadaHonk/porffor#201

Open

lukewagner force-pushed the main branch from 5cb57ad to 072b2fa Compare September 7, 2024 19:06

SingleAccretion mentioned this pull request Sep 12, 2024

AOT LLVM - Issue migrating from 8 to 9 (bad wasm file version: 0x1000d (expected 0x1)) dotnet/runtimelab#2685

Closed

lukewagner force-pushed the main branch 3 times, most recently from 824fdc5 to 74bd278 Compare September 18, 2024 22:53

alexcrichton mentioned this pull request Sep 28, 2024

Implement support for the standard mangling scheme in the component model bytecodealliance/wasm-tools#1828

Merged

lukewagner force-pushed the main branch from 8811361 to 4264096 Compare October 4, 2024 21:47

ydnar mentioned this pull request Oct 8, 2024

Discussion: Strategy to remove shell-exec from compilation path bytecodealliance/go-modules#202

Open

lukewagner mentioned this pull request Oct 21, 2024

Add 'stream' and 'future' types #405

Merged

lukewagner mentioned this pull request Nov 5, 2024

wasmtime serve: allow component reuse, serving > 1 request per instance bytecodealliance/wasmtime#9542

Open

lukewagner mentioned this pull request Nov 19, 2024

Relax requirements on assigning integers to resource handles #395

Closed

This was referenced Dec 2, 2024

Question about export naming convention in core module #422

Closed

Identify start function in CM thread.spawn with an index WebAssembly/shared-everything-threads#89

Open

lukewagner mentioned this pull request Jan 14, 2025

Single-module components #435

Open

sunfishcode reviewed Jan 15, 2025

Fix links

3c6c4a9

Co-authored-by: Dan Gohman <[email protected]>

tanishiking mentioned this pull request Jan 30, 2025

Add Support for Wasm Component Model and WASIp2 scala-js/scala-js#5121

Open

rajsite mentioned this pull request Feb 21, 2025

Adding document for comment bytecodealliance/sig-embedded#18

Open

lukewagner mentioned this pull request Mar 25, 2025

Proposal: Component Model v0 #479

Open
