CIP-???? | Modules in UPLC #946

rjmh · 2024-12-10T10:35:12Z

Cardano scripts are limited in complexity by the fact that each script must be supplied in one transaction, whether the script is supplied in the same transaction in which it is used, or pre-loaded onto the chain for use as a reference script. This limits script code size, which in turn limits the use of libraries in scripts, and ultimately limits the sophistication of Cardano apps, compared to competing blockchains. It is the aspect of Cardano that script developers complain about most.

This CIP addresses this problem directly, by allowing reference inputs to supply 'modules', which can be used from other scripts (including other modules), thus allowing the code of a script to be spread across many reference inputs. The 'main specification' requires no changes to UPLC, PTLC, PIR or Plinth; only a 'dependency resolution' step before scripts are run. Many variations are described for better performance, including some requiring changes to the CEK machine itself.

Higher performance variations will be more expensive to implement; the final choice of variations should take implementation cost into account, and (in some cases) may require extensive benchmarking.

(latest revision rendered from branch)

rphair · 2024-12-10T15:55:43Z

Thanks @rjmh - I'll change the review status to Draft (as formerly reflected in the title) and please let us know when you think it's ready for review and we can mark it Triage for introduction at the following CIP meeting & start tagging more Plutus representatives to go over it (@zliu41 @MicroProofs @michele-nuzzi you may be interested in an advance look).

rjmh · 2024-12-10T16:01:49Z

Hi Robert,

Actually, it's pretty complete. We were hoping to get some feedback from the community. There are many possible variations, but choosing between them could benefit from community input.

John

zliu41 · 2024-12-10T16:35:12Z

Yes @rphair this is ready for review

fallen-icarus · 2024-12-10T19:28:14Z

CIP-plutus-modules/README.md

+The motivation for these fees is to deter DDoS attacks based on
+supplying very large Plutus scripts that are costly to deserialize,
+but run fast and so incur low execution unit fees. While these fees
+are likely to be reasonable for moderate use of the module system, in
+the longer term they could become prohibitive for more complex
+applications. It may be necessary to revisit this design decision in
+the future. To be successful, the DDoS defence just needs fees to
+become *sufficiently* expensive per byte as the total size of
+reference scripts grows; they do not need to grow without bound. So
+there is scope for rethinking here.


It may be necessary to revisit this design decision in the future.

I don't think this can be left for "future work". I really think it should be updated if necessary when this CIP gets implemented. The reason for this is I don't think DApps should be treated as standalone applications. I think the following example perfectly exemplifies why:

Right now, stablecoins are not fungible with each other despite all of them effectively being the US dollar. You can't repay a loan in DJED using USDM. If DApps were composable, you could compose a DEX with the lending/borrowing DApp to convert the USDM to DJED in the same transaction where you make the loan payment. DApp composability makes stablecoins fungible!

This isn't possible on account style blockchains because each DApp is individually too expensive. On Cardano, you can compose 10 different DApps in the same transaction. I think this module approach would be huge, but only if it doesn't interfere with DApp composability. AFAIU that means lazy loading is 100% a requirement and users should be able to compose 4-5 DApps in a single transaction even with this module approach. Otherwise, this CIP could end up seriously handicapping the potential of Cardano's DeFi.

I was personally frustrated when I saw there was a hard cap on the reference script size; if people want to pay up to fit more DApps into the transaction, let them! I'm fine with the cost being exponential after a certain point (ideally after 4-5 DApps in the transaction), but the hard limit doesn't make sense to me as long as the user pays for it. The linked ADR doesn't give any justification for the hard limit aside from "further increase the resilience". This CIP could easily exacerbate the issues with the reference script fee calculation.

I agree it's going to be necessary. I just don't think it's a prerequisite... so modules should not be held up waiting for this. They'll be useful even without a change to reference script fees--just not as useful. I realise there are other factors to consider in fee-setting, but adding modules should raise the priority of fixing those fees considerably.
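To make the shape of the fee concern concrete, here is a minimal sketch of a tiered per-byte fee of the kind discussed above, where each further tier of reference-script bytes costs more per byte than the last. The tier size, base price and multiplier are assumptions chosen for illustration; they are not taken from this thread and should not be read as the ledger's actual parameters or formula.

```haskell
-- All three parameters below are hypothetical, for illustration only.
tierSize :: Integer
tierSize = 25600            -- bytes per pricing tier (assumed)

basePrice :: Rational
basePrice = 15              -- lovelace per byte in the first tier (assumed)

multiplier :: Rational
multiplier = 6 / 5          -- each further tier is 1.2x dearer per byte (assumed)

-- | Fee for a total of @n@ bytes of reference scripts in one transaction.
-- Bytes are priced tier by tier, so the average price per byte keeps
-- rising as more scripts (or modules) are referenced.
refScriptFee :: Integer -> Rational
refScriptFee = go basePrice
  where
    go price n
      | n <= 0        = 0
      | n <= tierSize = price * fromIntegral n
      | otherwise     = price * fromIntegral tierSize
                        + go (price * multiplier) (n - tierSize)
```

With these assumed numbers, one full tier costs 384000 lovelace and two full tiers cost 844800, so doubling the referenced bytes more than doubles the fee. A scheme that flattens out after some tier would still deter oversized scripts without growing without bound.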

KtorZ · 2024-12-11T09:47:39Z

CIP-plutus-modules/README.md

+for use as a reference script. This limits script code size, which in
+turn limits the use of libraries in scripts, and ultimately limits the
+sophistication of Cardano apps, compared to competing blockchains. It
+is the aspect of Cardano that script developers complain about most.


It is the aspect of Cardano that script developers complain about most.

Seems a bit arbitrary as a statement 😅 ... I have seldom heard people complaining about that. Rather, people complain about the script size which they often max out in their on-chain scripts without even bringing in dependencies.

See also:

https://cardano-foundation.github.io/state-of-the-developer-ecosystem/2024/#what-do-you-think-is-the-biggest-pain-point-of-cardanos-developer-ecosystem

https://cardano-foundation.github.io/state-of-the-developer-ecosystem/2023/#what-do-you-think-is-the-most-painful-point-of-cardanos-developer-ecosystem

Thanks--I took this from a meeting, but the claim seems to be exaggerated. I will weaken the language. Sounds like you agree that complaints about the script size limit are common though.

I disagree here. Prior to the introduction of reference scripts, complaints about size were common. Now, with reference scripts and the withdraw-zero trick (or other forwarding-logic scripts), script size is not really an issue; in fact, most dApps happily accept increased script size in exchange for reduced ex-units (more aggressive inlining / manual recursion unrolling / lookup tables).

I do agree that regardless of whether or not script size restraints are still a pain point, modules are still valuable.

KtorZ · 2024-12-11T09:54:36Z

CIP-plutus-modules/README.md

+the others provide supporting code of one sort or another. Thus the
+software engineering benefits of a module system are already
+available; other languages compiled to UPLC could provide a module
+system in a similar way. The *disadvantage* of this approach is that


Thus the software engineering benefits of a module system are already
available; other languages compiled to UPLC could provide a module
system in a similar way.

I don't think there's a single Plutus language framework today that doesn't support modules.

https://aiken-lang.org/language-tour/modules

https://www.hyperion-bt.org/helios-book/lang/modules.html

Opshin piggybacks on Python's module system, Plu-ts on TypeScript's, Scalus on Scala's and Plutarch on Haskell's.

For all those languages, though, the concept of modules exists at compile time only, whereas I believe this CIP is about bringing the concept to runtime, with dynamic resolution. Perhaps a parallel/analogy with statically linked vs dynamically linked dependencies is worth highlighting to make that clearer? Today, every module is very much statically bundled with scripts unless work is explicitly done to split them into separate validators.

(edit: I have now read the sections further down and I see that (1) this point is indeed made and (2) the approach suggested in this CIP is still closer to static linking done by the ledger prior to execution -- so, semi-dynamic 😅 ?).

Yeah, the choice of terminology can be a bit confusing and could be made more precise. The term "static/dynamic linking" is being used to refer to two different things:

You are saying: static linking = status quo where each script is a monolith, (semi-)dynamic linking = what this CIP proposes

whereas there's a subsection "Static vs Dynamic Linking" in the CIP, where static linking = a module specifies its dependency hashes, and dynamic linking = it doesn't specify them.

I will make it clear that many languages already support modules, not just Plutus/Haskell. But with the limitation that all the code ends up in one script, and so is subject to the script size limit.

KtorZ · 2024-12-11T10:08:05Z

CIP-plutus-modules/README.md

+        lookupArg (ScriptArg hash) = do
+          script <- lookup hash preimages
+          go script
+```


Hmm. This suggests that module resolution either happens at compile time (which would void the benefits of having modules to begin with) or is actually done by the ledger itself when executing scripts. My understanding leans towards the latter, which leads to the follow-up question: are you suggesting that the ledger becomes aware of script dependencies? And if so, by which means shall a transaction communicate this intent to the ledger?

At the moment, scripts are fundamentally already parameterized by a single parameter (two or three in PlutusV1 & PlutusV2); a validator has a signature that's roughly Data -> Validator. So I don't find it completely unreasonable to ask the ledger to now also apply some dependencies to the scripts in addition to the datum/redeemer & script context. Though it's unclear at this point how to signal that and how this is costed (will keep reading 👀).

are you suggesting that the ledger becomes aware of scripts dependencies? And if so, by which means shall transaction communicate this intent to the ledger?

Yes. A serialised script is deserialised into either a complete script with no dependencies, or a script plus a list of dependencies; in the latter case the ledger will need to retrieve those dependencies and link them together to form a complete script.

Exactly. I clarified that this happens during phase 2 verification, and that scripts on the chain are represented in this form, with dependencies just in the form of hashes.
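The resolution step described in this exchange can be sketched as follows. This is only an illustrative model under stated assumptions: `Hash`, `Script`, `Linked` and `link` are hypothetical names, script bodies are plain strings, and the map of preimages stands in for the scripts supplied by the transaction's reference inputs.

```haskell
import qualified Data.Map as Map

type Hash = String   -- stand-in for a real script hash (assumption)

-- | A script as stored on chain in this sketch: a body plus the hashes
-- of the modules it depends on.
data Script = Script { body :: String, deps :: [Hash] }

-- | A fully linked script: the body together with its resolved dependencies.
data Linked = Linked String [Linked]
  deriving (Eq, Show)

-- | Resolve every dependency hash against the supplied preimages.
-- Returns @Left h@ for the first hash @h@ that the transaction did not
-- supply. Real hashes cannot form cycles (a hash covers the dependency
-- hashes), so the recursion terminates; this toy model does not enforce that.
link :: Map.Map Hash Script -> Script -> Either Hash Linked
link preimages (Script b hs) = Linked b <$> traverse lookupArg hs
  where
    lookupArg h = case Map.lookup h preimages of
      Nothing -> Left h
      Just s  -> link preimages s
```

In this model a missing dependency is reported eagerly; a lazy-loading variant would instead leave the reference dangling and fail only if it is actually used.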

KtorZ · 2024-12-11T10:25:53Z

CIP-plutus-modules/README.md

+The goal of this variation is to eliminate the cost of evaluating
+scripts, by converting them directly to values. Since UPLC runs on the
+CEK machine, this means converting them directly into the `CekValue` type,
+*without* any CEK machine execution. To make this possible, the syntax


I'd argue that it doesn't eliminate the cost of evaluating scripts; rather, it becomes someone else's problem 😄! That someone being, indirectly, the ledger/node, which now has to do more (un-budgeted) work for free. I believe one of the fundamental design choices of Plutus was to have most of the decoding / conversion operations happen as part of the CEK evaluation so that they can be properly costed and paid for.

Otherwise, I'd argue that instead of providing Data arguments to scripts, we might as well provide pre-computed sum-of-products. But that means the cost of decoding the script context is now not paid for by execution units so has to be acknowledged through different means.

(To be clear, I am not against the idea! It seems like a reasonable ask to me, but I recall past conversations with the Plutus core team about it and why it is generally not deemed a viable option.)

This is an inexpensive operation that takes in the worst case linear time (and in some variants it is probably always constant time), so I think it's reasonable to consider it covered by the reference script fee, which is already an over-estimation of the script deserialization cost.

Right, it's linear time in the size of the top-level of scripts--one traversal over the code which need not descend inside values at the top level of a module. So reasonable to cover it from the reference script fee.

KtorZ · 2024-12-11T10:35:57Z

CIP-plutus-modules/README.md

+transitions. The conversion can be done *once* for a whole
+transaction, sharing the cost between several scripts if they share


The conversion can be done once for a whole transaction

That's a good point, and it also strengthens the idea that more of these transformations would be better off happening in the ledger as pre-processing instead of directly within the CEK evaluation.

In that particular case, though, it probably depends on the redeemer value too. If we assume a partial resolution like what you mention in Lazy Loading, then the traversal could likely yield different applications for the same script based on which redeemer is being used. For the same inputs, however, this is certainly a reasonable expectation. It's unclear to me whether there would be many "cache hits" in practice.

Another important point that supports this thought is how developers end up often structuring their scripts by mutualizing similar chunks of logic under validator purposes that execute only once per transaction. So a typical structure we see on-chain are trivial spending validators that defer their validation to a single withdraw validator; then forcing a 0-Ada withdrawal on a registered stake credential. Since validators have access to the entire transaction script context, it's always possible to have a validator guarding the 0-Ada withdrawal to execute and validate each input in a single pass; rather than re-doing work for every single input.

See for details: https://github.com/Anastasia-Labs/design-patterns/blob/main/stake-validator/STAKE-VALIDATOR.md#stake-validator-design-pattern
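As a rough illustration of the pattern just described, the following sketch models only the piece of a transaction that matters here. `Tx`, `withdrawals` and `spendValidator` are hypothetical names for this example and do not correspond to the real script context types.

```haskell
type ScriptHash = String   -- stand-in for a real script hash (assumption)

-- | Only the part of a transaction this sketch needs: reward withdrawals,
-- keyed by the hash of the staking script being withdrawn from.
newtype Tx = Tx { withdrawals :: [(ScriptHash, Integer)] }

-- | Trivial spending validator: it succeeds iff the shared stake
-- validator is run in this transaction. A withdrawal of 0 lovelace is
-- enough to force that validator to execute.
spendValidator :: ScriptHash -> Tx -> Bool
spendValidator stakeHash tx = stakeHash `elem` map fst (withdrawals tx)
```

Every input guarded by `spendValidator` does only this cheap membership check, while the stake validator runs once and can inspect all inputs in a single pass.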

Different redeemers may indeed result in different modules being required to be present - but I don't think this poses any problem, does it?

Your second point I think is the same as the "Merkelized Validators" discussed in the related work.

@KtorZ

The design patterns repo has a separate readme specifically for the withdraw zero trick,

https://github.com/Anastasia-Labs/design-patterns/blob/main/stake-validator/STAKE-VALIDATOR-TRICK.md

There was a section on "Merkelized validators" that discusses this; I have added links to the stake-validator trick directly to that section. I also made the discussion there a little more explicit: it's a great trick for sharing work between validators, which is useful with-or-without the modules discussed in this CIP--so it's not replaced by this CIP; but as a way of implementing modules it is intricate and unsatisfactory.

Re "cache hits", they will occur when different modules in the dependency tree depend in turn on the same module. So a module containing basic definitions for an application, and used in many parts of it, would fall into that category. So would a commonly-used library that many modules (in the same application) might depend on. I'm expecting to see quite a lot of this.

Where 'lazy loading' is concerned, note that it is the particular transaction that decides which dependencies to supply. Yes indeed, the dependencies needed will vary depending on the redeemer value. That's what we want to take advantage of--that in a particular transaction, we know what the redeemer value is, and so we can decide to omit modules that are not going to be needed. Dangling pointers ftw! (As long as they're not going to be used).
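The sharing and lazy-loading behaviour described in the last two paragraphs can be sketched with a small traversal. This is a toy model under assumptions: `Modules` and `convertOnce` are hypothetical names, and "conversion" is represented only by recording the order in which module hashes are visited.

```haskell
import qualified Data.Map as Map

type Hash = String   -- stand-in for a real script hash (assumption)

-- | Modules supplied by one transaction: module hash -> hashes it imports.
type Modules = Map.Map Hash [Hash]

-- | Visit the dependencies of @root@, converting each supplied module at
-- most once, and return the hashes in conversion order. A hash missing
-- from the map is skipped: under lazy loading it was simply not supplied.
-- Assumes the graph is acyclic, as it must be for real script hashes.
convertOnce :: Modules -> Hash -> [Hash]
convertOnce mods root = reverse (go [] root)
  where
    go done h
      | h `elem` done = done                    -- cache hit: already converted
      | otherwise = case Map.lookup h mods of
          Nothing -> done                       -- omitted module, left dangling
          Just ds -> h : foldl go done ds       -- dependencies first, then h
```

For an application whose modules `a` and `b` both import `base`, `base` is converted once even though it is reached twice; dropping `b` from the map models a transaction that knows `b` will not be needed.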

KtorZ · 2024-12-11T10:40:30Z

CIP-plutus-modules/README.md

+using the SoP extension (CIP-85) as `constr 0 x1...xn`, but the only
+way to select the `i`th component is using
+```
+  case t of (constr 0 x1...xi...xn) -> xi
+```
+which takes time linear in the size of the tuple to execute, because
+all `n` components need to be extracted from the tuple and passed to
+the case branch (represented by a function).


Such tuples could also be represented as pairs of pairs, bringing this cost down to log2(size) steps?

Yes, that would be log n case terms: cheaper in terms of execution units (at least for long tuples) but bigger in script size.

Logarithmic is better than linear, but it's also the cost of accessing variables in the environment (which is logarithmic in the size of the environment). So the advantage of putting the module exports into one tuple instead of bunging them all into the environment would disappear. Much better to bite the bullet and put in explicit projections, getting constant time access.
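To illustrate the logarithmic alternative discussed here, the sketch below encodes a tuple as a balanced tree of pairs and counts the pair projections needed to reach one component. It is plain Haskell rather than UPLC, and `Tup`, `fromList` and `select` are hypothetical names; in compiled code the path to a component would be fixed statically rather than computed from stored sizes.

```haskell
-- | A tuple encoded as nested pairs. The Int in 'Pair' is the number of
-- components in the left half, used here only to route the index.
data Tup a = Leaf a | Pair Int (Tup a) (Tup a)

-- | Build a balanced tree from a non-empty list of components.
fromList :: [a] -> Tup a
fromList []  = error "fromList: empty tuple"
fromList [x] = Leaf x
fromList xs  = Pair k (fromList l) (fromList r)
  where
    k      = length xs `div` 2
    (l, r) = splitAt k xs

-- | Select the i-th component (0-based, assumed in range), also returning
-- how many pair projections were needed to reach it.
select :: Int -> Tup a -> (a, Int)
select _ (Leaf x) = (x, 0)
select i (Pair k l r)
  | i < k     = step (select i l)
  | otherwise = step (select (i - k) r)
  where
    step (x, n) = (x, n + 1)
```

For a tuple of 1024 components every selection takes 10 projections, compared with passing all 1024 components to a case branch in the flat encoding. An explicit constant-time projection would avoid both costs.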

KtorZ · 2024-12-11T10:47:09Z

CIP-plutus-modules/README.md

+Currently, the definition of “script” used by the ledger is (approximately):
+```
+newtype Script = Script ShortByteString
+```


I think it's worth mentioning that we cannot actually publish arbitrary CEK Term as scripts but only UPLC Program (which are wrapped Term with versioning metadata).

The ledger enforces that all published scripts (in reference or witness) have this Program envelope. So it might be worth defining a new type of envelope for Modules. This would also make it possible to distinguish modules on-chain from actual validator scripts, which may be handy should we need to apply further restrictions from the ledger regarding those (since, as outlined below, it is incumbent upon the ledger to manage those dependencies and pre-process them on behalf of validators).

I don't think this CIP is proposing publishing CEK terms as scripts. As to distinguishing validators vs. modules, the Script data type defined in "Subvariation: Unboxed modules" allows for it.

Right, the CEK values exist only during phase 2 validation; they are never stored on the chain. And as Ziyang says, the 'unboxed modules' subvariation does distinguish module scripts from validators, primarily because (in that variation) they are subject to different syntactic restrictions. So if the deserializer is going to check those, then it needs to know what kind of script it is deserializing.

KtorZ · 2024-12-11T10:47:57Z

CIP-plutus-modules/README.md

+the `Script` type accordingly
+```
+data Script = ValidatorScript         CompiledCode [ScriptArg]
+            | ModuleScript            CompiledCode [ScriptArg]


Ah! This seems to echo my previous comment about making a distinction (a distinction that should carry over into the serialisation to be of any use, IMO).

KtorZ · 2024-12-11T10:52:30Z

CIP-plutus-modules/README.md

+Currently each script on-chain is tagged with a specific ledger language version - V1, V2, V3 or native script - and this version tag is a component of the script hash.
+A logical approach, therefore, is to continue doing so for module scripts, and require that a validator script and all modules it references must use the same ledger language version; failure to do so leads to a phase-1 error.
+
+A different approach is to distinguish between validator scripts and module scripts by applying version tags only to validator scripts.
+Module scripts are untagged and can be linked to any validator script.
+This makes module scripts more reusable, which is advantageous because in most cases, a UPLC program has the same semantics regardless of the ledger language version.


I am not sure that the second approach is sound; because the version not only defines the interface to the validator, but also:

Which Plutus builtins are actually available

The semantics of some of those builtins

The costing functions of those builtins

For example, in Plutus V1/V2, cons_bytestring(256, bytes) is equivalent to cons_bytestring(0, bytes) (the runtime performs a free modulo 256), but in PlutusV3, it results in an out-of-bound error. That's the case for a few other builtins which have subtle semantic changes. (Technically, the semantics are bound to the Program version -- 1.0.0 vs 1.1.0 -- but this one is tightly coupled to the language version and I am taking a slight shortcut here.)
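The difference in behaviour described above can be modelled in a few lines. This is a sketch over lists of bytes, not the actual builtin implementation; `consWrapping` and `consChecked` are hypothetical names for the two behaviours.

```haskell
import Data.Word (Word8)

-- | Wrap-around behaviour, as described for V1/V2: the integer is reduced
-- modulo 256 before being prepended.
consWrapping :: Integer -> [Word8] -> [Word8]
consWrapping n bs = fromIntegral (n `mod` 256) : bs

-- | Checked behaviour, as described for V3: values outside 0..255 fail
-- instead of wrapping.
consChecked :: Integer -> [Word8] -> Maybe [Word8]
consChecked n bs
  | n >= 0 && n <= 255 = Just (fromIntegral n : bs)
  | otherwise          = Nothing
```

A module that calls the builtin with 256 would therefore succeed when linked into one kind of validator and fail in the other, which is the hazard of untagged modules.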

So I'd argue that to keep everyone's life easier, enforcing the same "language version" across modules and validators is a fairly reasonable ask.

Yes, this is the point I made in the next paragraph. I think we'll most likely go with the first approach, i.e., tagged modules.

I prefer that option too--allowing different language versions here would impose a constraint on all future language versions, which feels error-prone and uncomfortable.
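The tagged-modules rule favoured here amounts to a simple check at link time. The sketch below is illustrative only: `Lang` and `checkVersions` are hypothetical names, and the real check would be a ledger rule over script tags rather than a standalone function.

```haskell
-- | Ledger language tags for Plutus scripts (native scripts omitted).
data Lang = PlutusV1 | PlutusV2 | PlutusV3
  deriving (Eq, Show)

-- | Check that every module linked into a validator carries the same
-- language tag as the validator itself. On failure, return the tags of
-- the modules that do not match.
checkVersions :: Lang -> [Lang] -> Either [Lang] ()
checkVersions validator modules =
  case filter (/= validator) modules of
    []  -> Right ()
    bad -> Left bad
```

For example, `checkVersions PlutusV3 [PlutusV2, PlutusV3]` yields `Left [PlutusV2]`, corresponding to the error case described in the quoted text.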

Are the semantic changes of builtin functions all documented in the changelog or anywhere?

colll78 · 2024-12-13T06:54:40Z

CIP-plutus-modules/README.md

+Currently each script on-chain is tagged with a specific ledger language version - V1, V2, V3 or native script - and this version tag is a component of the script hash.
+A logical approach, therefore, is to continue doing so for module scripts, and require that a validator script and all modules it references must use the same ledger language version; failure to do so leads to a phase-1 error.
+
+A different approach is to distinguish between validator scripts and module scripts by applying version tags only to validator scripts.
+Module scripts are untagged and can be linked to any validator script.
+This makes module scripts more reusable, which is advantageous because in most cases, a UPLC program has the same semantics regardless of the ledger language version.


Are the semantic changes of builtin functions all documented in the changelog or anywhere?

colll78 · 2024-12-13T07:10:31Z

CIP-plutus-modules/README.md

+Note that, on Ethereum, a proxy contract can be updated without
+changing its contract address---thanks to mutable state. On Cardano, a
+script address *is* the hash of its code; of course, changing the code
+will change the script address. It is very hard to see how that could
+possibly be changed without a fundamental redesign of Cardano. So the
+methods discussed below are different in nature from the Ethereum one:


The exact same thing is true on Cardano. You can easily create proxy contracts that can be updated without changing its contract address.

    mkProxyContract :: ClosedTerm (PAsData PCurrencySymbol :--> PScriptContext :--> PUnit)
    mkProxyContract = plam $ \protocolParamsCS ctx -> P.do
      ctxF <- pletFields @'["txInfo", "redeemer", "scriptInfo"] ctx
      infoF <- pletFields @'["inputs", "referenceInputs", "outputs", "signatories", "wdrl"] ctxF.txInfo
      referenceInputs <- plet $ pfromData infoF.referenceInputs
      -- Extract protocol parameter UTxO
      ptraceInfo "Extracting protocol parameter UTxO"
      let paramUTxO =
            pfield @"resolved" #$
              pmustFind @PBuiltinList
                # plam (\txIn ->
                        let resolvedIn = pfield @"resolved" # txIn
                         in phasDataCS # protocolParamsCS # (pfield @"value" # resolvedIn)
                      )
                # referenceInputs
      POutputDatum ((pfield @"outputDatum" #) -> paramDat') <- pmatch $ pfield @"datum" # paramUTxO
      forwardToScriptHash <- plet $ punsafeCoerce @_ @_ @(PAsData PByteString) (pto paramDat')
      let invokedScripts =
            pmap @PBuiltinList
              # plam (\wdrlPair ->
                      let cred = pfstBuiltin # wdrlPair
                       in punsafeCoerce @_ @_ @(PAsData PByteString) $ phead #$ psndBuiltin #$ pasConstr # pforgetData cred
                    )
              # pto (pfromData infoF.wdrl)
      pif (pelem # forwardToScriptHash # invokedScripts)
        (pconstant ())
        perror

The above script is a proxy contract which is parameterized by a state token (an NFT) which authenticates a UTxO that contains the script hash that this proxy forwards validation to (via the withdraw-zero trick). If that UTxO lives at a user's wallet, they can update the proxy contract by spending it back to the same address and changing the datum to be a different script hash. If the UTxO lives at a script, then the script logic will validate any update.

That being said, I would caution that this section on upgradability should be removed altogether.
DApp upgradability is already a security nightmare; it’s very hard to support it without completely sacrificing decentralization. You need to use an onchain governance protocol, like Agora, but these protocols are very experimental on Cardano, so much so that even the creators of Agora do not use it for governance of their protocol.

I think the advice in the CIP regarding how upgradability can be achieved is quite dangerous, given how many exploits compromised “upgrade keys” have led to on Ethereum / Solana, and it is generally out of scope for this proposal.

Draft CIP on an extension to add modules to UPLC

f859a0d

rphair added the Category: Plutus Proposals belonging to the 'Plutus' category. label Dec 10, 2024

rphair changed the title ~~Draft CIP on an extension to add modules to UPLC~~ CIP-???? | Modules in UPLC Dec 10, 2024

rphair marked this pull request as draft December 10, 2024 15:55

rphair marked this pull request as ready for review December 10, 2024 17:10

rphair added the State: Triage Applied to new PR afer editor cleanup on GitHub, pending CIP meeting introduction. label Dec 10, 2024

fallen-icarus reviewed Dec 10, 2024


KtorZ reviewed Dec 11, 2024


Responses to feedback

a8c7d5e

colll78 reviewed Dec 13, 2024
