This repository has been archived by the owner on Dec 7, 2024. It is now read-only.

Alternative sections #10

Open
rossberg opened this issue Mar 17, 2022 · 2 comments

Comments

@rossberg
Member

In the discussion on #6, two observations have been made:

  1. Limiting alternative code path choices to function granularity seems to be sufficient, at least for the primary gcc/clang use case.

  2. On the other hand, it is not sufficient to enable alternative choices in the code section alone:

    • If a choice affects a function's type signature, then at least the function section also needs to make a corresponding choice.
    • If one of the choices of type signature is not understood by all engines (e.g., using a new or optional type), then a corresponding choice has to be made in the type section already, where the function signatures are stored.
    • Alternative representations of constants may need to be stored in data sections.
    • In a scenario storing SIMD or alternative representations in e.g. GC types, the type section is also affected, as may be the global or element section.

In light of this, I suggest reconsidering a more general mechanism that operates at the level of sections.

The conditional sections proposal did that, but it had one significant drawback: it was too liberal, allowing the resulting sections to have completely different sizes (or to be absent altogether), which would make it difficult for tools to process a module coherently.

We could refine this as follows:

  • Instead of a unary construct

    #if <condition> <section> #endif
    

    we change it to an n-ary construct

    #if <condition> <section> (#elif <condition> <section>)* #else <section> #endif
    

    where all of the section alternatives must have the same type and size.

  • As before, this is combined with the ability to have multiple occurrences of each section type (like we already want for other reasons as well), such that a conditional can be reduced to a diff.

  • Separately, we can revisit what the representation of "conditions" is.

The n-ary construct mirrors the #if-#elif of C. Crucially, it enforces "well-formedness" of the index spaces created.
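To make the well-formedness rule concrete, here is a minimal sketch of a checker for such a conditional. It is not part of any proposal: the `Section` and `Alternative` classes, the function names, and the modelling of a condition as a plain feature-name string are all hypothetical. It also assumes that "size" means the number of entries a section contributes to its index space, which the discussion does not pin down.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Section:
    kind: str            # section type, e.g. "type", "function", "code"
    entries: List[str]   # opaque entries; only their count matters here


@dataclass
class Alternative:
    condition: Optional[str]  # hypothetical: a feature name; None marks #else
    section: Section


def check_conditional(alts: List[Alternative]) -> Tuple[str, int]:
    """Return (kind, size) shared by all alternatives, or raise ValueError."""
    if len(alts) < 2:
        raise ValueError("need at least one conditional branch and an #else")
    if alts[-1].condition is not None:
        raise ValueError("the last alternative must be the #else branch")
    if any(a.condition is None for a in alts[:-1]):
        raise ValueError("only the last alternative may be unconditional")
    first = alts[0].section
    for a in alts[1:]:
        if a.section.kind != first.kind:
            raise ValueError("alternatives differ in section type")
        if len(a.section.entries) != len(first.entries):
            raise ValueError("alternatives differ in size")
    return first.kind, len(first.entries)


def select(alts: List[Alternative], features: Set[str]) -> Section:
    """Pick the first alternative whose condition holds, else the #else one."""
    check_conditional(alts)
    for a in alts:
        if a.condition is None or a.condition in features:
            return a.section
    raise AssertionError("unreachable: the #else branch always matches")
```

Because every alternative has the same type and size, a tool that cannot evaluate the conditions still knows how many indices the section occupies, whichever branch an engine ends up choosing.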

In terms of the binary format, such a section conditional would perhaps only store the section type and size once, as a form of "type annotation" on the conditional itself, instead of repeating it in every nested section (which then would only contain the section body).
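One way to picture that layout is the sketch below. It is purely illustrative and not a proposed encoding: the field order, the per-alternative condition id (with `0` standing for `#else`), and the function names are assumptions made for this example. It again takes "size" to mean the entry count; if byte size was meant, the per-alternative length prefix would move into the shared header instead. Integers use unsigned LEB128, as elsewhere in the Wasm binary format.

```python
def encode_u32(n: int) -> bytes:
    """Unsigned LEB128 encoding."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_u32(buf: bytes, pos: int):
    """Decode unsigned LEB128 at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_conditional(section_id: int, count: int, alternatives) -> bytes:
    """Hypothetical layout: the section id and entry count appear once, as a
    "type annotation"; each alternative holds a condition id and a body."""
    out = bytearray([section_id])
    out += encode_u32(count)
    out += encode_u32(len(alternatives))
    for condition_id, body in alternatives:
        out += encode_u32(condition_id)  # assumption: 0 marks the #else branch
        out += encode_u32(len(body))     # byte length of this body only
        out += body
    return bytes(out)


def decode_conditional(buf: bytes):
    """Inverse of encode_conditional; returns (section_id, count, alternatives)."""
    section_id = buf[0]
    count, pos = decode_u32(buf, 1)
    n, pos = decode_u32(buf, pos)
    alternatives = []
    for _ in range(n):
        condition_id, pos = decode_u32(buf, pos)
        length, pos = decode_u32(buf, pos)
        alternatives.append((condition_id, buf[pos:pos + length]))
        pos += length
    return section_id, count, alternatives
```

For example, `encode_conditional(10, 2, [(1, body_a), (0, body_b)])` would describe a code section (id 10) with two entries and two alternative bodies, where `body_a` and `body_b` are placeholder byte strings.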

Honestly, I expect that it is no more complex to define this conditional construct generically for all section types than to have separate equivalent constructs for (at least) the code, function, and type sections.

@tlively
Member

tlively commented Mar 18, 2022

  • If a choice affects a function's type signature, then at least the function section also needs to make a corresponding choice.
  • If one of the choices of type signature is not understood by all engines (e.g., using a new or optional type), then a corresponding choice has to be made in the type section already, where the function signatures are stored.
  • Alternative representations of constants may need to be stored in data sections.
  • In a scenario storing SIMD or alternative representations in e.g. GC types, the type section is also affected, as may be the global or element section.

None of these apply to the LLVM use case, which does not allow a function's signature to differ based on the target features and does not provide a mechanism that I'm aware of for changing data based on features. For future GC use cases, I strongly believe that SIMD would only be useful for operating on arrays of i8 (or maybe i16 or i32 or i64), so there would be no user demand for alternative data representations for different feature sets.

I agree that this section-based design would be workable, but I do think it would be more work to implement and would be a larger and more cross-cutting change to the structure of Wasm modules overall for no additional benefit.

@penzn
Contributor

penzn commented Apr 27, 2022

None of these apply to the LLVM use case, which does not allow a function's signature to differ based on the target features and does not provide a mechanism that I'm aware of for changing data based on features. For future GC use cases, I strongly believe that SIMD would only be useful for operating on arrays of i8 (or maybe i16 or i32 or i64), so there would be no user demand for alternative data representations for different feature sets.

If we want to rely on existing compiler infrastructure, keeping the signature the same would work better, likely by passing pointers to memory. It would probably also involve automatically generated guards based on what the function uses internally (or what the user says it uses).
