feat(docs): `asm` functions #1061

novusnota · 2024-11-19T02:22:30Z

Rewrote the method ID collisions section to remove all logical jumps and make it much more streamlined :)

Also adjusted the structure a little towards the upcoming PR revamping this page. I'll push the draft of it right after we deal with asm functions here.

P.S.: I actually call the argument-to-return position mappings "arrangements" and not "shuffle" as in grammar.ohm, because the latter kinda implies randomness, while those are actually deterministic. Hence, "asm arrangements".

Issue

Closes Docs: describe asm functions #1011.
Closes Describe how to use TVM assembly from Tact tact-docs#300 (old issue in tact-docs)

Also, resolved two teeny tiny issues from tact-docs (virtually 3-5 lines of fixes for each, no need for a separate CHANGELOG entry):

tact-lang/tact-docs#368
tact-lang/tact-docs#374

Checklist

I have updated CHANGELOG.md
I have run the linter, formatter and spellchecker
I did not do unrelated and/or undiscussed refactorings


jeshecdom · 2024-11-20T16:16:01Z

docs/src/content/docs/book/functions.mdx

+asm(-> 1 0) extends mutates fun asmLoadCoins(self: Slice): Int { LDVARUINT16 }
+//     ---
+//     Notice, that return values are best thought as tuples with indexed access into them
+//     and not as bottom-up representation of stack values


Probably we need to clarify this in a more detailed, tutorial-like example, because the connection with the stack is very vague and confusing. I can write such an example on a separate page and then link to it from here. I can work on that example in a separate PR, since I am currently writing docs anyway. What do you think? Should I use the same TVM instructions you used in these examples, or another TVM instruction that produces 3 or more results, to exemplify the use of structs?

I think that makes sense. If that's a tutorial-like example (for the Cookbook, I presume), then I'd use some other instruction with 3+ results to exemplify the use of Structs, exactly :)

UPD: Well, since my original answer I discovered that unused values are discarded only when the primitive type is specified as the return type. And now the idea of a tutorial makes much less sense to me — it's better that we improve descriptions here and now. But I may be totally wrong

jeshecdom · 2024-11-20T18:01:02Z

I agree that this is confusing, and I think that the cause is that we are mixing two mental models: a stack (low-level) and tuples (high-level).

We should pick only one and stick with it the entire explanation. It seems that describing everything in terms of tuples is more intuitive, but then we should not mention the stack (or mention how tuples are pushed and popped from a stack in a separate subsection, and only in that section mention the stack).

So, for example, I would start the examples saying something about the TVM instructions, something like this:

"
Even though TVM instructions work with a stack, TVM instructions can be seen, intuitively, as maps from tuples to tuples. To see how TVM instructions map tuples to tuples in the TVM stack, see [link: here]. Thinking in terms of tuples makes the explanation of asm functions much clearer, but for those who want to see an explanation using the stack directly, see [link: here].
"

Then, explaining the meaning of a declaration like:

asm(len self -> 1 0) fun testFun(self: Slice, len: Int): Result { TVM_INSTRUCTION }

struct Result {
   res1: Int;
   res2: Bool;
}

amounts to saying simply:

"
testFun passes the argument tuple (len, self) to instruction TVM_INSTRUCTION. Suppose (r0, r1) is the tuple result of TVM_INSTRUCTION, then testFun reorders the result according to the map -> 1 0, i.e., the 1-th index element (r1) has now index 0, and the 0-th index element (r0) has now index 1, producing the tuple (r1, r0). Finally, testFun assigns the tuple (r1, r0) into the Result struct one field at a time, producing Result {res1: r1, res2: r0}.
"
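To make the proposed wording easier to check, here is a minimal Python sketch of it. It is only a toy model of the description above, not of what Tact, FunC or Fift actually do: plain Python lists stand in for the ordered results (they are not TVM tuples), the helper names `rearrange` and `to_struct` are hypothetical, and the rule that the arrangement must mention every result exactly once is an assumption of the sketch.

```python
# Toy model of the description above; plain Python lists stand in for the
# ordered results of TVM_INSTRUCTION (they are not TVM tuples).

def rearrange(results, arrangement):
    """Apply a return arrangement such as `-> 1 0`:
    position i of the output takes the element at index arrangement[i]."""
    # Assumption of this sketch: every result index appears exactly once.
    if sorted(arrangement) != list(range(len(results))):
        raise ValueError("arrangement must mention every result index exactly once")
    return [results[i] for i in arrangement]


def to_struct(field_names, values):
    """Assign values to struct fields one at a time, in declaration order."""
    if len(field_names) != len(values):
        raise ValueError("field count and value count differ")
    return dict(zip(field_names, values))


results = ["r0", "r1"]                       # what TVM_INSTRUCTION produced
reordered = rearrange(results, [1, 0])       # the `-> 1 0` part
result_struct = to_struct(["res1", "res2"], reordered)

print(reordered)
print(result_struct)
```

In this model the fields of Result are filled strictly in declaration order, so the arrangement alone decides which result ends up in which field.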

novusnota · 2024-11-20T19:09:26Z

@jeshecdom interesting note. The cases with multiple instructions should be covered too. And tuples on TON are denoted with square brackets [], so it's better to use those. Also, it's best for readability to remove parentheses as much as possible, so things like "and the 0-th index element (r0)" will be "and the 0-th index element r0" — no need to add indirection and visual pauses with parens :)

@anton-trunov wdyt about #1061 (comment)?

anton-trunov · 2024-11-21T08:02:39Z

> TVM instructions can be seen, intuitively, as maps from tuples to tuples.

this is incorrect in a very specific technical sense: a tuple is a TVM data structure that occupies precisely one TVM stack position but can contain multiple other TVM primitives, including tuples

the term you probably intended to use is tensor

anton-trunov · 2024-11-21T08:03:51Z

> testFun passes the argument tuple (len, self)

I find it confusing

anton-trunov · 2024-11-21T08:04:20Z

@novusnota just adapt the corresponding calling convention description from tvm.pdf

anton-trunov · 2024-11-21T08:09:31Z

@novusnota you also need to check how structures that are returned from a function are actually encoded

anton-trunov · 2024-11-21T08:10:51Z

To fully finish this section, #910 needs to be resolved too

jeshecdom · 2024-11-21T08:52:18Z

> > TVM instructions can be seen, intuitively, as maps from tuples to tuples.
>
> this is incorrect in a very specific technical sense: a tuple is a TVM data structure that occupies precisely one TVM stack position but can contain multiple other TVM primitives, including tuples
>
> the term you probably intended to use is tensor

I meant mathematical tuple, but now I see that this would introduce much more confusion because of the technical terms in TVM. So, the explanation should stick with the stack and use the technical terms in TVM.

jeshecdom · 2024-11-21T08:58:43Z

> @jeshecdom interesting note. The cases with multiple instructions should be covered too. And tuples on TON are denoted with square brackets [], so it's better to use those. Also, it's best for readability to remove parentheses as much as possible, so things like "and the 0-th index element (r0)" will be "and the 0-th index element r0": no need to add indirection and visual pauses with parens :)
>
> @anton-trunov wdyt about #1061 (comment)?

Yeah, we should stick with the stack explanation and use the technical terms in TVM, because I see everyone is confused now :). By the way, a question:

In a function like this:

asm fun testFun(a: Int, b: Int): Result { 
 INS_1
 INS_2 
 .....
 INS_n   // Let us suppose that after INS_n finishes, 
         // there are 5 results in the stack
}

Does Tact know that after executing those instructions, there will be exactly 5 results in the stack?
What happens if struct Result has more than 5 fields? Will Tact pop more than 5 elements from the stack until it fills the struct, having as consequence the popping of elements that are not part of the intended result?

anton-trunov · 2024-11-21T09:04:59Z

> Does Tact know that after executing those instructions, there will be exactly 5 results in the stack?

not in the current implementation

> Will Tact pop more than 5 elements from the stack until it fills the struct

nope (the consequence of the previous answer)

This should be documented, of course, but an even more important question is "are returned structs actually represented as tensors (multiple TVM values)?"

anton-trunov · 2024-11-21T09:07:22Z

and, of course, the symmetrical question for input function parameters (including passing structs)

anton-trunov · 2024-11-21T09:09:27Z

in any case, each such point should be accompanied by a concrete example of an asm-function

novusnota · 2024-11-21T09:33:09Z

> just adapt the corresponding calling convention description from tvm.pdf

Sure, the -> 0 1 is much better explained in terms of s0..s255 stack registers.

> Does Tact know that after executing those instructions, there will be exactly 5 results in the stack?

Nope, at the moment it's all handled by FunC, which in turn just passes it to Fift, which does all the work. Neither Tact nor FunC checks anything until it's too late and the user hits exit code 5, 7, or whatever else.

Also, there could be more things in the stack; only the topmost 5 are of interest if we know that after all the instructions we need 5 values.

> This should be documented, of course, but an even more important question is "are returned structs actually represented as tensors (multiple TVM values)?"
>
> and, of course, the symmetrical question for input function parameters (including passing structs)
>
> in any case, each such point should be accompanied by a concrete example of an asm-function

👍

anton-trunov · 2024-11-21T09:46:50Z

@jeshecdom when we have the grammar and AST for our embedded assembly language, we will be able to typecheck asm-functions and warn users when their stack discipline does not make sense

jeshecdom · 2024-11-21T09:49:33Z

> > just adapt the corresponding calling convention description from tvm.pdf
>
> Sure, the -> 0 1 is much better explained in terms of s0..s255 stack registers.

We could do the following with specific examples, having first explained how function arguments are pushed onto the stack:

1. Explain how the result tensor is popped from the stack. Suppose that after popping you get some tensor (r0, r1, ..., rn).
2. Explain how the notation -> n m l is just a re-arrangement of the result tensor.
3. Explain how the re-arranged tensor is mapped into the result type of the function (this includes explaining how the tensor is mapped into a struct, a primitive type, or whatever).

novusnota · 2024-11-21T10:12:01Z

Eh, Structs are represented as tensors (...), but that's the doing of Tact+FunC. If we were to target TVM directly, we would've dealt with stack entries ourselves, all without tensors.

So I'm a bit hesitant on explaining things in tensors, and instead let's just properly describe stack and stack registers.

jeshecdom · 2024-11-21T10:28:32Z

> Eh, Structs are represented as tensors (...), but that's the doing of Tact+FunC. If we were to target TVM directly, we would've dealt with stack entries ourselves, all without tensors.
>
> So I'm a bit hesitant on explaining things in tensors, and instead let's just properly describe stack and stack registers.

Sounds good.

anton-trunov · 2024-11-21T10:41:10Z

> So I'm a bit hesitant on explaining things in tensors, and instead let's just properly describe stack and stack registers.

sounds good to me too, since tvm.pdf does not even mention tensors, looks like it's a term coined by the FunC community


novusnota · 2024-11-24T17:40:32Z

Added description of the current Tact-flavored assembly from WIP feat: new asm parser #1064
Described everything from the stack point of view, including its "registers"
Refined the overall top-to-bottom reading flow

jeshecdom · 2024-11-25T10:27:29Z

docs/src/content/docs/book/functions.mdx

+
+:::
+
+### Stack calling conventions {#asm-calling}


The stack calling conventions section reads really nicely. Just a couple of questions. What happens if one of the asm function arguments is a struct? What about an argument which is a struct with nested structs? And what happens if the return type is a struct with nested structs, like in this declaration?

struct A {
   a1: Int;
   a2: Int;
}

struct B {
   b1: Int;
   b2: A;
}

asm fun test(s: B, ...): B { ....... }

jeshecdom · 2024-11-25T12:19:56Z

docs/src/content/docs/book/functions.mdx

+// while `self` will be pushed last and get on top of the stack
+asm(c self) extends fun asmStoreDict(self: Builder, c: Cell?): Builder { STDICT }
+
+// Changing the order of return values of LDVARUINT16,


It is still not clear what the notation -> 1 0 means regarding what happens to the results of LDVARUINT16 in the stack itself. The explanation states that 1 represents the value of stack register 1, etc. but it does not explain the significance of writing them in the order -> 1 0. Probably what needs to be said is that the notation -> 1 0 describes how the contents of the stack will be rearranged, when reading -> 1 0 left-to-right: the contents of register s1 will be placed at the top of the stack, and the contents of register s0 will be placed second-to-top.

One alternative way of explaining could be in terms of removing from the stack: -> 1 0 means that s1 is removed first, followed by s0. Hence, the function returns the Builder in s0 because it was the stack content removed last.

Now I am having second thoughts on using "removing" because it becomes confusing with what happens with the rest of the stack. For example, suppose that after executing some asm function with declaration -> 0 1 2, we have the 5 element stack (top is leftmost):

a b c d e

Then, -> 0 1 2 means "remove s0, then s1, then s2", so that the stack after removing s0 is:

b c d e

But then, s0 contains now b, when previously b was in s1.

So, probably a better word instead of "removing" would be "read from":

-> 1 0 means that s1 is read from the stack first, followed by s0. Hence, the function returns the Builder in s0 because it was the stack content read last.
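The difference between "removing" and "reading" can be seen in a small Python sketch. It is a toy model only: a list with index 0 as the top of the stack stands in for registers s0, s1, and so on, and the helper names `remove_in_order` and `read_in_order` are hypothetical. It says nothing about how Tact, FunC or Fift actually treat the arrangement.

```python
# Toy stack: index 0 is the top, so index i plays the role of register s<i>.
stack = ["a", "b", "c", "d", "e"]


def remove_in_order(stack, registers):
    """Pop the given registers one after another.
    Each pop shifts the remaining entries, so later indices hit other values."""
    work = list(stack)  # copy, keep the caller's stack untouched
    taken = [work.pop(r) for r in registers]
    return taken, work


def read_in_order(stack, registers):
    """Read the given registers without modifying the stack."""
    return [stack[r] for r in registers]


removed, rest = remove_in_order(stack, [0, 1, 2])
read = read_in_order(stack, [0, 1, 2])

print(removed, rest)
print(read)
```

With removal, after s0 is taken the value b moves into s0, so asking for s1 next yields c rather than b. With reading, the registers keep their meaning for the whole arrangement, which is why "read from" is the less confusing wording.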

Thing is, as I've just checked in tests, -> 0 1 2 is not about taking or not taking any results, but merely about positioning items for whatever result type we've specified. Like, if the return type is Int, one can only specify -> 0 and nothing else, even though -> 0 in this case is the same as not writing anything at all. And when Structs, long Structs (more than 15 entries) or even nested Structs are involved, this gets complicated.

Thus, my description of s0 matching 0, s1 matching 1 is actually incorrect and has to be rewritten. And I've got to check the cases with long or nested Structs here as well, same as for the "stack calling conventions" bit.

I see. So, this declaration is incorrect (because it returns only one element):

asm(self len -> 1 0) extends fun asmLoadInt(self: Slice, len: Int): Slice { LDIX }

but this is correct:

asm(self len) extends fun asmLoadInt(self: Slice, len: Int): Int { LDIX }

even though it will discard the Slice result and keep only the Int. Or is this last one also incorrect?

Mmmm.... very confusing indeed. So, when using the notation -> m n p it is not possible to discard values in the result type. I think this is acceptable. It is better to explicitly state all the results than to rely on understanding implicit discards.

The first one is incorrect. The second one could've been correct if we had our own backend or if we altered FunC generation, but I've since tested it and it's also incorrect: nothing can be discarded in the result type.

It worked for me in previous tests mainly because FunC doesn't perform any checks, and because all asm function bodies are embedded in Fift code.

I had some DROP instructions very deep later on in other asm functions, which unexpectedly (for me) cleared the stack for this one. And I noticed that a little too late.

In the end, this really proves the point of those cautionary paragraphs at the top of the assembly functions description. This stuff is really messy, intertwined and hard to debug (until our own backend for it, of course). But I'll persevere.

I understand and thank you for your effort!

So, let's adapt the explanation so that no discards happen in the result type.

Now, regarding nested structs, structs in arguments and structs with more than 15 fields, if you think that the explanation would become too complex to fit it in the page or that the explanation would become so convoluted because of those exceptional cases, probably it would be better to explain those in a separate page, with a link to that page.

novusnota added 5 commits November 19, 2024 03:16

fix(docs): correct function signatures

385c0d0

tact-lang/tact-docs#368

fix(docs): "mutable" -> "mutation" functions

1738dff

tact-lang/tact-docs#374

feat(docs): asm-functions

d5de05d

feat(docs): rewrote the method ID section to remove logical jumps and…

9c4c71b

… make it more streamlined

chore: retroactive CHANGELOG edit

7263c88

novusnota added this to the v1.6.0 milestone Nov 19, 2024

novusnota requested a review from a team as a code owner November 19, 2024 02:22

novusnota changed the title ~~feat(docs): asm-functions~~ feat(docs): asm functions Nov 19, 2024

anton-trunov requested changes Nov 19, 2024


jeshecdom reviewed Nov 19, 2024


anton-trunov self-assigned this Nov 20, 2024

novusnota and others added 2 commits November 20, 2024 12:21

Update docs/src/content/docs/book/import.mdx

7157a2e

Co-authored-by: Anton Trunov <[email protected]>

fix: apply suggestions from code review

0dd10fe

novusnota requested review from jeshecdom and anton-trunov November 20, 2024 12:21

novusnota added 2 commits November 20, 2024 13:33

typo

558387c

Merge branch 'main' into closes-1011-asm-funs

2070ae3

jeshecdom reviewed Nov 20, 2024


fix: adjust descriptions after code review

71a7469

novusnota added 2 commits November 24, 2024 18:29

feat: described the stack, described Tact-flavored assembly

ab143a6

And updated the version of Starlight used

Merge branch 'main' into closes-1011-asm-funs

7e56f85

novusnota requested a review from jeshecdom November 24, 2024 17:38

fix: add note that Tact assembly will be available in v1.6

720ab55

jeshecdom reviewed Nov 25, 2024
