Improve VmInstantiation calibration #825

Conversation
Plotting a chart from my local run of VmInstantiation shows that the CPU instruction cost is linear but the memory cost seems to be constant.
Hm yeah, I thought this was fairly... linear actually! I mean, concretely it definitely does have to do work that's at least linear in contract size (in the sense that it has to parse the wasm). Can you not use the synth-wasm subcrate to get a clearly varying set of samples?
Thanks for the plot @anupsdf.
Okay, I was definitely wrong about the weak linear dependency.

First experiment (contract with a single function having n instructions)

Second experiment (contract with n empty functions)

I'm leaning towards using the first experiment as the calibration target, since I feel real contracts should have a somewhat limited number of internal functions. The resulting linear param in the second experiment may be too large and too penalizing.
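For concreteness, the two experimental contract shapes can be sketched as follows. This is illustrative only, assuming the `wat` crate; the actual benchmarks use the synth-wasm machinery:

```rust
// Illustrative only: builds the two synthetic contract shapes from the
// experiments above. The real benchmarks use the soroban-synth-wasm crate.

/// Experiment 1: a single exported function containing n trivial instructions.
fn single_function_contract(n: usize) -> Vec<u8> {
    let body = "(nop)".repeat(n);
    wat::parse_str(format!("(module (func (export \"f\") {body}))")).unwrap()
}

/// Experiment 2: a module with n empty internal functions.
fn many_empty_functions_contract(n: usize) -> Vec<u8> {
    let funcs = "(func)".repeat(n);
    wat::parse_str(format!("(module {funcs})")).unwrap()
}
```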
Title changed from "VmInstantiation cost type from linear to constant" to "VmInstantiation calibration".
Updated the PR and the comments. Ready for review again.
Hello! Wanted to chime in as I've just started benchmarking the Blend protocol to start looking for optimizations. I have some concerns about how much weight the `VmInstantiation` cost carries. The main culprit is a custom token (it tracks collateral or liabilities for an underlying token, and the protocol calls into it frequently). From what I can quickly grasp from this PR, it looks like costs will increase substantially in this department. Happy to share more information if it would be useful.
@jayz22 The second, larger bound should be used -- we're aiming for an upper bound. Users can submit a malicious contract.

@mootz12 Yes, cross-contract calls are for the time being the largest cost centre, by a fairly wide margin, and they're likely to always be so. We're likely to bring down the magnitude of the difference in the future by caching a certain amount of material in the VM, making instantiation avoid re-parsing wasms from one call to the next, but we haven't put any time into this yet, and will only be able to do so much. The rest of the system is fairly efficient, so by comparison instantiation of each VM is high-cost.
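To illustrate the caching idea, here is a hypothetical sketch; this `ModuleCache` is made up here, not the host's actual design:

```rust
use std::collections::HashMap;

// Hypothetical sketch of the caching idea: keep parsed wasmi Modules keyed by
// code hash so repeated instantiations of the same contract skip re-parsing.
struct ModuleCache {
    engine: wasmi::Engine,
    parsed: HashMap<[u8; 32], wasmi::Module>,
}

impl ModuleCache {
    fn get_or_parse(&mut self, code_hash: [u8; 32], wasm: &[u8]) -> &wasmi::Module {
        let engine = &self.engine;
        self.parsed
            .entry(code_hash)
            // Parse + validate only on a cache miss.
            .or_insert_with(|| wasmi::Module::new(engine, wasm).expect("valid wasm"))
    }
}
```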
@graydon I've updated it to use the second, larger bound. @mootz12 Thanks for your feedback. It is unfortunate that VM instantiation cost is as high as it is now, due to the work necessary to set up the VM and parse the wasm contracts. The metered cost parameters are just a reflection of the calibrated costs (and we have to take the upper bound to prevent malicious contracts). We may have improvements and optimizations for it in the future (created #827 to track it).
Do we have any understanding of what actually drives these numbers up so much, and why we have so much variance? E.g. in Anup's data the linear param is around 230, while Jay's two experiments give values from 100 to 600. Either way the number is much higher than the cost of interpreting a Wasm instruction, which seems really weird to me. Does the VM do some heavy pre-processing? Is something off about our measurement methodology?
Please see the comment above: #825 (comment)
Sure, I understand that these are two or even three different setups. What I don't understand is why we observe such a drastic difference, and whether we are using the proper value as the input size (it seems to me like we are not).
Well, in this case (and in most cases) there is no perfect single input that captures all the degrees of variation. Instantiating a VM requires a lot of work, including parsing and validating the Wasm file, initializing linear memory, linking host functions, etc.
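For reference, those steps map roughly onto the wasmi API as follows. This is a simplified sketch; exact signatures vary across wasmi versions, and the "env"/"log_i64" host function is made up for illustration:

```rust
use wasmi::{Engine, Linker, Module, Store};

// Roughly the work being metered: parse + validate, then link + instantiate.
fn instantiate(wasm: &[u8]) -> Result<(), Box<dyn std::error::Error>> {
    let engine = Engine::default();
    // Parsing and validating the Wasm bytes: the size-dependent part.
    let module = Module::new(&engine, wasm)?;
    let mut store = Store::new(&engine, ());
    // Linking host functions so the contract's imports can be resolved.
    let mut linker = <Linker<()>>::new(&engine);
    linker.func_wrap("env", "log_i64", |v: i64| drop(v))?;
    // Instantiation proper: allocate linear memory, resolve imports, run start.
    let _instance = linker.instantiate(&mut store, &module)?.start(&mut store)?;
    Ok(())
}
```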
That's not necessarily the only answer; we have other options, such as changing what the input value is, or breaking the operation down into multiple linear components that depend on different values.

If we believe that the number of functions is what drives the cost, then why don't we use it as the input parameter to metering? This of course creates a 'loop' for the current model (because you need to load the contract to count the functions), but there are ways around that.
I'm not sure that's the best solution.
One potential way to exercise this would be to create two sets of benchmarks. I think both can be generated with a bit of proc macros. Another thing that might potentially matter is the number of imported host functions. I'm not sure if we're covering that now - it might or might not have an impact (again, even the 687 coefficient might not necessarily be enough). It would be nice to also cover contracts that import as many host fns as possible.
Multi-dimensional inputs were considered (see #208) but we decided they were not worth the complexity at the time. We would be facing the same challenges of picking the number of degrees of freedom, as well as deciding which inputs are worth including. Not to mention that calibration across multi-dimensional inputs will be challenging in terms of computation and accuracy. In this case one can simply argue the contract can do other wild things to make it expensive to parse, but we need to make an assumption and draw a line, and I think the many-simple-local-functions approach is a decent one.

If we think there is another clear-cut worst case, we would consider using that.

This might help, but again I'm not sure it is worth the extra complexity, and I'm not sure it solves the problem: it doesn't eliminate the heuristic of deciding how many local functions is "too many", i.e. needs to fall into one bucket vs another.

All of the host functions are linked into Wasmi for any contract. That is necessary for Wasmi to resolve a particular host function for a contract to call. As @graydon mentioned above and in the thread, we have other, more systematic ways to make VM instantiation cheaper, e.g. sharing modules and caching parsed contracts, which may reduce the cost fundamentally. But in any case the cost formula will have to exhibit the worst-case estimate in some way.
There are a lot of degrees of freedom in a wasm contract. We're not going to capture all of them, and we've no strong reason to believe that "large number of empty functions" is the worst-case use of bytes in terms of incurring wasmi costs -- merely that it's worse than the other cases we tried. If we find even worse cases, we have to either find a way to dynamically limit or prohibit them, or integrate them into the cost model as well, since a user can DoS us with a txset full of instances of contracts that exhibit the worst case. CC'ing @brson on this, he might have some fun trying to wire up synth-wasm (or some other wasm fuzzer, e.g. wasm-smith) to find "the most expensive thing we can ask wasmi to do during parsing and validation".
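That fuzzing experiment could be wired up roughly like this; a sketch assuming the wasm-smith, arbitrary, and wasmi crates (exact APIs vary by version):

```rust
use arbitrary::{Arbitrary, Unstructured};
use std::time::{Duration, Instant};

// Generate a random-but-valid module from raw fuzzer bytes and time how long
// wasmi takes to parse and validate it. A fuzzer would then try to maximize
// elapsed time relative to module size.
fn parse_cost(raw: &[u8]) -> Option<(usize, Duration)> {
    let mut u = Unstructured::new(raw);
    let module = wasm_smith::Module::arbitrary(&mut u).ok()?;
    let bytes = module.to_bytes();
    let engine = wasmi::Engine::default();
    let start = Instant::now();
    wasmi::Module::new(&engine, &bytes[..]).ok()?;
    Some((bytes.len(), start.elapsed()))
}
```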
I'm not necessarily suggesting using multi-dimensional inputs. We could just have two linear functions (if the runtime is in fact a linear function of the inputs), e.g. 'vm parse (contract size)' and 'vm instantiate (function count)', as sketched after this comment. This may or may not be the right model, but we won't know if we don't try.

It does have the potential to resolve the up-to-5x discrepancy between the cases.

What I'm proposing is to do more rigorous benchmarking to figure that out.

That's not exactly true; the contract itself has to declare the host fns it's using. I don't know whether this has an impact on performance, which is why it's just one more idea of what could be measured.

Sure, so shouldn't we think more about what the worst case might be? But in any case, if there is a straightforward parameter for the model (like the function count), shouldn't we just use it? If a linear combination of contract size and function count gives us a model that has, say, just <2x fluctuation between the best and worst cases we find (instead of 3-5x), then I'd say that would be a huge win for the ecosystem.
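To make the two-linear-function proposal concrete, a hypothetical sketch; the struct and the coefficients are made up for illustration, not the host's actual cost model:

```rust
// Hypothetical two-component cost model for VM instantiation: one linear term
// in contract size (parsing) and one in function count (instantiation).
struct LinearCost {
    const_term: u64,
    lin_term: u64,
}

impl LinearCost {
    fn eval(&self, input: u64) -> u64 {
        self.const_term + self.lin_term * input
    }
}

/// Total charge = parse(contract size) + instantiate(function count).
fn vm_instantiation_cost(wasm_len: u64, num_funcs: u64) -> u64 {
    // Coefficients are placeholders, not calibrated values.
    let vm_parse = LinearCost { const_term: 10_000, lin_term: 100 }; // per byte
    let vm_instantiate = LinearCost { const_term: 5_000, lin_term: 600 }; // per fn
    vm_parse.eval(wasm_len) + vm_instantiate.eval(num_funcs)
}
```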
I've done a few experiments and did not find a way to further improve the current metering model. Given our current constraints on time and resources, we should proceed with merging this PR and adjust the resource limits at the network config level.
What

Resolves #821 (`VmInstantiation`). The main diff is in the `VmInstantiation` costs.

Why
`VmInstantiation` cost depends linearly on contract size, and has been calibrated as a linear model. Previously we were using the `soroban-test-wasm` contracts, which were few; as a result, the linear parameter was poorly calibrated (R^2 around 0.7), and we manually overrode the calibrated parameters with constant ones. This PR uses synthesized Wasm so that the size of the contracts can be varied and controlled. The resulting R^2 score is > 0.99.
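For reference, the fit behind these R^2 numbers is essentially ordinary least squares over (input size, measured cost) samples. A simplified sketch of the math, not the actual calibration code:

```rust
// Ordinary least squares fit of measured cost (y) against input size (x),
// returning (slope, intercept, R^2).
fn fit_linear(xs: &[f64], ys: &[f64]) -> (f64, f64, f64) {
    let n = xs.len() as f64;
    let mean_x = xs.iter().sum::<f64>() / n;
    let mean_y = ys.iter().sum::<f64>() / n;
    let sxy: f64 = xs.iter().zip(ys).map(|(x, y)| (x - mean_x) * (y - mean_y)).sum();
    let sxx: f64 = xs.iter().map(|x| (x - mean_x).powi(2)).sum();
    let slope = sxy / sxx;
    let intercept = mean_y - slope * mean_x;
    // R^2 = 1 - SS_res / SS_tot; close to 1.0 means the line fits well.
    let ss_res: f64 = xs.iter().zip(ys)
        .map(|(x, y)| (y - (intercept + slope * x)).powi(2))
        .sum();
    let ss_tot: f64 = ys.iter().map(|y| (y - mean_y).powi(2)).sum();
    (slope, intercept, 1.0 - ss_res / ss_tot)
}
```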
Known limitations

During calibration, it was realized that the precision of metering is compromised by the integer representation of the model params (especially linear ones that are close to 0). They should probably be represented as fixed-point integers. Created this follow-up issue: #824