feat(docs): asm functions #1061
… make it more streamlined
Co-authored-by: Anton Trunov <[email protected]>
asm(-> 1 0) extends mutates fun asmLoadCoins(self: Slice): Int { LDVARUINT16 }
// ---
// Notice, that return values are best thought as tuples with indexed access into them
// and not as bottom-up representation of stack values
Probably we need to clarify this with a more detailed, tutorial-like example, because the connection with the stack is very vague and confusing. I can write such a detailed example on a separate page and then add a link to that page here. I can work on that example in a separate PR, since I am now working on writing docs. What do you think? Should I use the same TVM instructions you were using in these examples, or another TVM instruction that produces 3 or more results, to exemplify the use of structs?
I think that makes sense. If that's a tutorial-like example (for the Cookbook, I presume), then I'd use some other instruction with 3+ results to exemplify the use of Structs, exactly :)
UPD: Well, since my original answer I discovered that unused values are discarded only when the primitive type is specified as the return type. And now the idea of a tutorial makes much less sense to me — it's better that we improve descriptions here and now. But I may be totally wrong
I agree that this is confusing, and I think the cause is that we are mixing two mental models: a stack (low-level) and tuples (high-level). We should pick only one and stick with it throughout the entire explanation. It seems that describing everything in terms of tuples is more intuitive, but then we should not mention the stack (or mention how tuples are pushed to and popped from a stack in a separate subsection, and only mention the stack in that subsection). So, for example, I would start the examples saying something about the TVM instructions, something like this: " Then, explaining the meaning of a declaration like:
amounts to saying simply: "
@jeshecdom interesting note. The cases with multiple instructions should be covered too. And tuples on TON are denoted with square brackets.
@anton-trunov wdyt about #1061 (comment)?
This is incorrect in a very specific technical sense: a tuple is a TVM data structure that occupies precisely one TVM stack position but can contain multiple other TVM primitives, including tuples. The term you probably intended to use is tensor.
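The distinction drawn above can be illustrated with a toy model. This is plain Python, not TVM or Tact code: a list stands in for the stack, a Python tuple stands in for a TVM tuple, and the values `"slice"` and `42` are placeholders chosen only for illustration.

```python
# Toy model only: a Python list stands in for the TVM stack (top = last element).

# Returning two separate values (a "tensor" in FunC terms) uses two stack positions.
stack_with_tensor = []
stack_with_tensor.append("slice")
stack_with_tensor.append(42)

# Returning one tuple uses a single stack position that holds both values.
stack_with_tuple = []
stack_with_tuple.append(("slice", 42))

tensor_positions = len(stack_with_tensor)
tuple_positions = len(stack_with_tuple)
print(tensor_positions, tuple_positions)
```

Under this model, the same two values take two positions when returned separately and one position when wrapped in a tuple, which is why the two terms should not be used interchangeably.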
I find it confusing
@novusnota just adapt the corresponding calling convention description from tvm.pdf
@novusnota you also need to check how structures that are returned from a function are actually encoded
To fully finish this section, #910 needs to be resolved too
I meant mathematical tuple, but now I see that this would introduce much more confusion because of the technical terms in TVM. So, the explanation should stick with the stack and use the technical terms in TVM.
Yeah, we should stick with the stack explanation and use the technical terms in TVM, because I see everyone is confused now :). By the way, a question: In a function like this:
Does Tact know that after executing those instructions, there will be exactly 5 results in the stack?
not in the current implementation
nope (the consequence of the previous answer). This should be documented, of course, but an even more important question is "are returned structs actually represented as tensors (multiple TVM values)?"
and, of course, the symmetrical question for input function parameters (including passing structs)
in any case, each such point should be accompanied by a concrete example of an
Sure, the
Nope, at the moment it's all handled by FunC, which in turn just passes it to Fift, which does all the work. Neither Tact nor FunC check anything until it's too late and the user hits exit code 5, 7, or whatever else. Also, there could be more things in the stack; only the topmost 5 are of interest if we know that after all the instructions we need 5 values.
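The "only the topmost 5 are of interest" point can be sketched with a toy stack. This is an illustrative Python model, not how Tact, FunC, or Fift work internally; `take_results` is a hypothetical helper, and the placeholder values are made up.

```python
def take_results(stack, count):
    """Return the `count` topmost values of `stack` (top = last element).

    Hypothetical helper: it does not know how many values the instructions
    really produced. It only fails when the stack is too short.
    """
    if count > len(stack):
        raise IndexError("stack underflow: fewer values than expected")
    return stack[len(stack) - count:]


# Two unrelated values sit below the five results we care about.
stack_after = ["x", "y", 1, 2, 3, 4, 5]
results = take_results(stack_after, 5)
print(results)
```

In this model, values deeper in the stack are left alone. A mismatch between the declared and the actual number of results goes unnoticed unless the stack runs out, which mirrors the "until it's too late" remark above.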
👍
@jeshecdom when we have the grammar and AST for our embedded assembly language, we will be able to typecheck asm functions and warn the user when their stack discipline does not make sense
We could do the following with specific examples (previously, we should have explained how function arguments are pushed into the stack):
Eh, Structs are represented as tensors. So I'm a bit hesitant on explaining things in tensors, and instead let's just properly describe stack and stack registers.
Sounds good.
sounds good to me too, since tvm.pdf does not even mention tensors, looks like it's a term coined by the FunC community
And updated the version of Starlight used
:::

### Stack calling conventions {#asm-calling}
The stack calling conventions section reads really nicely. Just a couple of questions. What happens if one of the asm function arguments is a struct? What about an argument which is a struct with nested structs? And what happens if the return type is a struct with nested structs? For example, in this declaration:
struct A {
a1: Int;
a2: Int;
}
struct B {
b1: Int;
b2: A;
}
asm fun test(s: B, ...): B
{ ....... }
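These questions are still open at this point in the thread. One possible answer can be sketched in Python under a loudly stated assumption that is not confirmed here: that a struct is passed as its fields laid out one after another, with nested structs expanded in place. The `flatten` helper and the sample values are hypothetical.

```python
def flatten(value):
    """Flatten a nested dict (a stand-in for a struct) into a flat list
    of leaf values, in field declaration order."""
    if isinstance(value, dict):
        flat = []
        for field in value.values():
            flat.extend(flatten(field))
        return flat
    return [value]


# Mirrors `struct B { b1: Int; b2: A; }` with `struct A { a1: Int; a2: Int; }`.
b = {"b1": 1, "b2": {"a1": 2, "a2": 3}}
flat_b = flatten(b)
print(flat_b)
```

If that assumption holds, `B` would occupy three stack positions, in the order `b1`, `a1`, `a2`. Whether the compiler really behaves this way, especially for long structs, is exactly what the thread asks to be verified.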
// while `self` will be pushed last and get on top of the stack
asm(c self) extends fun asmStoreDict(self: Builder, c: Cell?): Builder { STDICT }

// Changing the order of return values of LDVARUINT16,
It is still not clear what the notation `-> 1 0` means regarding what happens to the results of `LDVARUINT16` in the stack itself. The explanation states that 1 represents the value of stack register 1, etc., but it does not explain the significance of writing them in the order `-> 1 0`. Probably what needs to be said is that the notation `-> 1 0` describes how the contents of the stack will be rearranged, when reading `-> 1 0` left-to-right: the contents of register `s1` will be placed at the top of the stack, and the contents of register `s0` will be placed second-to-top.
One alternative way of explaining could be in terms of removing from the stack: `-> 1 0` means that `s1` is removed first, followed by `s0`. Hence, the function returns the `Builder` in `s0` because it was the stack content removed last.
Now I am having second thoughts on using "removing" because it becomes confusing with what happens with the rest of the stack. For example, suppose that after executing some asm function with declaration `-> 0 1 2`, we have the 5 element stack (top is leftmost):
a b c d e
Then, `-> 0 1 2` means "remove `s0`, then `s1`, then `s2`", so that the stack after removing `s0` is:
b c d e
But then, `s0` now contains `b`, when previously `b` was in `s1`.
So, probably a better word instead of "removing" would be "read from":
`-> 1 0` means that `s1` is read from the stack first, followed by `s0`. Hence, the function returns the `Builder` in `s0` because it was the stack content read last.
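The difference between "removing" and "reading from" can be checked with a toy model. This is plain Python, not TVM: a list stands in for the stack, with index 0 as the top (register `s0`), matching the "top is leftmost" convention used above.

```python
# Toy model: index 0 of the list is the top of the stack (register s0).
stack = ["a", "b", "c", "d", "e"]

# "Removing" s0, then s1, then s2: each removal renumbers what is left,
# so the second and third steps no longer hit the original s1 and s2.
remaining = list(stack)
removed = []
for register in (0, 1, 2):
    removed.append(remaining.pop(register))

# "Reading" s0, then s1, then s2: the stack is not modified,
# so the register numbers keep pointing at the same values.
read = [stack[register] for register in (0, 1, 2)]

print(removed, remaining, read)
```

In this model, removing yields `a`, `c`, `e` while reading yields `a`, `b`, `c`, which is the renumbering problem described in the comment.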
Thing is, as I've just checked in tests, `-> 0 1 2` is not about taking or not taking any results, but merely about positioning items for whatever result type we've specified. Like, if the return type is `Int`, one can only specify `-> 0` and nothing else, even though `-> 0` in this case is the same as not writing anything at all. And when Structs, long Structs (more than 15 entries) or even nested Structs are involved, this gets complicated.
Thus, my description of `s0` matching `0`, `s1` matching `1` is actually incorrect and has to be rewritten. And I've got to check the cases with long or nested Structs here as well, same as for the "stack calling conventions" bit.
I see. So, this declaration is incorrect (because it returns only one element):
asm(self len -> 1 0) extends fun asmLoadInt(self: Slice, len: Int): Slice { LDIX }
but this is correct:
asm(self len) extends fun asmLoadInt(self: Slice, len: Int): Int { LDIX }
even though it will discard the `Slice` result and keep only the `Int`. Or is this last one also incorrect?
Mmmm.... very confusing indeed. So, when using the notation `-> m n p` it is not possible to discard values in the result type. I think this is acceptable. It is better to explicitly state all the results than to rely on understanding implicit discards.
First one is incorrect. Second one could've been correct if we had our own backend or if we'd alter FunC generation, but since I tested that it's also incorrect — nothing can be discarded in result type.
It worked for me in previous tests mainly because FunC doesn't perform any checks, and because all asm function bodies are embedded in Fift code.
I had some DROP instructions very deep later on in other asm functions, which unexpectedly (for me) cleared the stack for this one. And I noticed that a little too late.
In the end, this really proves the point of those cautionary paragraphs at the top of the assembly functions description. This stuff is really messy, intertwined and hard to debug (until our own backend for it, of course). But I'll persevere.
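The "nothing can be discarded" rule can be expressed as a small check. This is a hypothetical Python sketch of such a rule, not something Tact or FunC performs today (the thread notes that neither checks anything). `is_full_arrangement` is an illustrative name, and the result counts are supplied by hand.

```python
def is_full_arrangement(arrangement, result_count):
    """True if `arrangement` mentions every result position
    0..result_count-1 exactly once, so no result is discarded."""
    return sorted(arrangement) == list(range(result_count))


# Two result positions, both mentioned: acceptable under this rule.
both_named = is_full_arrangement([1, 0], 2)
# Two result positions, only one mentioned: one result would be discarded.
one_dropped = is_full_arrangement([0], 2)
# A single result position: `-> 0` is the only option.
single = is_full_arrangement([0], 1)
print(both_named, one_dropped, single)
```

The check only covers the arrangement itself. It assumes the number of values in the declared return type already equals the number of values the instructions leave on the stack, which is the part that currently cannot be verified.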
I understand and thank you for your effort!
So, let's adapt the explanation so that no discards happen in the result type.
Now, regarding nested structs, structs in arguments, and structs with more than 15 fields: if you think the explanation would become too complex to fit on the page, or too convoluted because of those exceptional cases, it would probably be better to explain them on a separate page and link to it.
Rewrote the method ID collisions section to remove all logical jumps and make it much more streamlined :)
Also adjusted the structure a little towards the upcoming PR revamping this page. I'll push the draft of it right after we deal with asm functions here.
P.S.: I actually call argument to return position mappings "arrangements" and not "shuffle" as in grammar.ohm, because the latter kinda implies randomness, while those are actually deterministic. Hence, "asm arrangements".
Issue
asm functions #1011 (… tact-docs)
Also, resolved two teeny tiny issues from tact-docs, virtually 3-5 lines of fixes for each, no need for a separate CHANGELOG entry:
fromSlice #1092
Checklist