Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: tx simulator #1613

Merged
merged 18 commits into from
Apr 20, 2023
Merged

feat: tx simulator #1613

merged 18 commits into from
Apr 20, 2023

Conversation

cmwaters
Copy link
Contributor

@cmwaters cmwaters commented Apr 11, 2023

Ref: #1256, #1535

This PR introduces a new package txsim for contolled fuzz testing at a transaction level. It's purpose is to simulate a wide range of possible user interactions while also being able to apply a considerable load to the network.

How it works

The package is designed in a way to make it easy to compose new fuzzing scenarios, removing the complexity around account nonces and balances, signing, encoding and broadcasting. The basic element is a Sequence. This is an interface that represents a pattern of recursive messages (one could be I send 5 dollars to you and you send 5 dollars back and so forth). Each sequence is provisioned a set of accounts that are responsible for formulating the messages. The accounts are isolated, allowing for a single sequence to be replicated hundreds of times. Sequences are also provided access to the current state as well as a deterministic source of randomness for constructing their output messages.

There are three current sequences:

  • Blob: a single account create random permutations of PayForBlob messages
  • Send: two or more accounts transferring funds to one another
  • Stake: a single account continuously staking, claiming and redelegating

Remaining work:

  • Create a command line version of the tool
  • Explore the need for persistence and garbage collection of accounts
  • Look to brainstorm useful sequences such as goverance proposals and authz.

@cmwaters cmwaters requested a review from evan-forbes as a code owner April 11, 2023 19:23
@MSevey MSevey requested review from a team and mojtaba-esk and removed request for a team April 11, 2023 19:23
@MSevey MSevey requested a review from a team April 11, 2023 19:27
@cmwaters cmwaters self-assigned this Apr 11, 2023
@cmwaters
Copy link
Contributor Author

Looks like there's some data race occurring but I'm not sure if it's actually triggered from the new code I wrote

testing/txsim/run_test.go Show resolved Hide resolved
testing/txsim/run_test.go Show resolved Hide resolved
testing/txsim/run_test.go Show resolved Hide resolved
@MSevey MSevey requested a review from a team April 12, 2023 13:25
Copy link
Member

@evan-forbes evan-forbes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly non blocking comments and questions

ideally we can use the testing framework here to replace the integration test (and the smoke test)

I think it would also help to simply clarify the different testing directories, such as testutil, testing, and the test directory in app.

related to
#832 #1162 #829

pollTime time.Duration

mtx sync.Mutex
sequence int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we somehow clarify what type of sequence this is referring to? is this a testing sequence? a cosmos-sdk sequence (nonce)? a local count of successfully submitted txs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, sorry for the misunderstanding. This is simply used to multiplex queries across multiple RPC clients (via round robin)

testing/txsim/client.go Show resolved Hide resolved
Comment on lines 226 to 228
func (tc *TxClient) next() {
tc.sequence = (tc.sequence + 1) % len(tc.rpcClients)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a comment describing the purpose of this logic. also, in other places we use the mutex before mutating local txclient state (such as the height), do we need to do that here? if not, can we include that in the comment?

sendAmount int
maxHeightDelay int
accounts []types.AccAddress
sequence int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to above, we're using the word sequence for multiple things, which is fine but preferably we document what we mean here

Copy link
Collaborator

@rootulp rootulp Apr 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or rename variable to disambiguate. One candidate rename for the Sequence interface is OperatorGenerator

testing/txsim/sequence.go Show resolved Hide resolved
Comment on lines 252 to 254
func (qc *QueryClient) next() {
qc.sequence = (qc.sequence + 1) % len(qc.connections)
}
Copy link
Member

@evan-forbes evan-forbes Apr 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we document with these that they are relying on the callers to handle mutexes since these are mutating state?

Comment on lines 207 to 210
// increment the sequence number for all the signers
for _, signer := range signers {
am.accounts[signer.String()].Sequence++
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this action need to be protected by a mutex?

Copy link
Contributor Author

@cmwaters cmwaters Apr 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah perhaps. The same accounts should never be used by any two concurrent processes but because there could be concurrent writes to the map, this may panic

testing/txsim/client.go Show resolved Hide resolved

builder.SetFeeAmount(types.NewCoins(types.NewInt64Coin(app.BondDenom, feeAmount)))
// the master account is responsible for paying the fees
builder.SetFeeGranter(am.masterAccount.Address)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a benefit to using fee granter for every tx instead of funding the accounts in some other mechansim (either by including them in the genesis via the existing tooling in testnode or just sending them funds)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, I would like the developer implementing a sequence to not have to worry about fees (they simply state the transactions they want to execute) thus I felt using feegrant to be the best solution. Do you have a problem with feegrant?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense, but what is the difference if their account is adequately funded? They still don't have to worry about gas limit or the specifying a fee, since we're doing that here.

I highly highly doubt it, but there might be some weird scenario where all of our tests use feegrant, so we missed some bug that would otherwise occur in a normal scenario.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a valid point. Perhaps it does make the most sense if we leak that complexity to the sequence developers so that they have to over fund their accounts and manage the gas limits (we can hardcode the gas price to the minimum)

testing/txsim/run_test.go Show resolved Hide resolved
@MSevey MSevey requested a review from a team April 13, 2023 09:12
@codecov-commenter
Copy link

codecov-commenter commented Apr 13, 2023

Codecov Report

Merging #1613 (6373b93) into main (952edae) will increase coverage by 1.84%.
The diff coverage is 62.75%.

@@            Coverage Diff             @@
##             main    #1613      +/-   ##
==========================================
+ Coverage   49.05%   50.89%   +1.84%     
==========================================
  Files          85       92       +7     
  Lines        4964     5739     +775     
==========================================
+ Hits         2435     2921     +486     
- Misses       2290     2518     +228     
- Partials      239      300      +61     
Impacted Files Coverage Δ
testutil/testnode/read.go 0.00% <0.00%> (ø)
testing/txsim/run.go 61.81% <61.81%> (ø)
testing/txsim/stake.go 65.21% <65.21%> (ø)
testing/txsim/account.go 67.44% <67.44%> (ø)
testing/txsim/client.go 78.88% <78.88%> (ø)
testing/txsim/blob.go 80.64% <80.64%> (ø)
testing/txsim/send.go 89.65% <89.65%> (ø)
app/estimate_square_size.go 89.09% <100.00%> (ø)
testutil/testnode/full_node.go 82.87% <100.00%> (ø)
testutil/testnode/node_init.go 64.17% <100.00%> (ø)
... and 1 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

evan-forbes
evan-forbes previously approved these changes Apr 13, 2023
Copy link
Member

@evan-forbes evan-forbes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍 👍 can't wait to write some sequences!

// set default gas limit to cover the costs of most transactions
// At 0.001 utia per gas, this equates to 1000utia per transaction
// In the future we may want to give some access to the sequence itself
gasLimit = 1000000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might want to increase this so that we can test blobs larger than ~240 shares, so +1 that having the sequence control this later would be a good idea, or increasing it to about 8 million should be sufficient for the near term

testing/txsim/client.go Show resolved Hide resolved
rootulp
rootulp previously approved these changes Apr 13, 2023
Copy link
Collaborator

@rootulp rootulp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, really cool testing tool

Comment on lines +88 to +91
type Range struct {
Min int
Max int
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines +65 to +71
// generate a random namespace for the blob
namespace := make([]byte, ns.NamespaceVersionZeroIDSize)
_, err := rand.Read(namespace)
if err != nil {
return Operation{}, fmt.Errorf("generating random namespace: %w", err)
}
namespaces[i] = ns.MustNewV0(namespace)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[optional] can use

func RandomBlobNamespace() Namespace {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks but I want to use the random source provided in the method so that by having a specific seed it should always reproduce the same sequence

testing/txsim/blob.go Show resolved Hide resolved
testing/txsim/client.go Outdated Show resolved Hide resolved
@MSevey MSevey requested a review from a team April 14, 2023 08:12
@cmwaters cmwaters dismissed stale reviews from evan-forbes and rootulp via 846d4b1 April 18, 2023 08:24
@cmwaters cmwaters enabled auto-merge (squash) April 19, 2023 13:12
@@ -177,7 +177,7 @@ func (c *Context) FillBlock(squareSize int, accounts []string, broadcastMode str
// in order to get the square size that we want, we need to fill half the
// square minus a few for the tx (see the square estimation logic in
// app/estimate_square_size.go)
shareCount := (squareSize * squareSize / 2) - 1
shareCount := (squareSize * squareSize / 2) - 2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[out of curiosity] why did the size of one transaction occupy change from 1 to 2 shares? Did this PR fix a bug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this PR fix a bug?

Yes it added math.Ceil to this line because our uint64 conversion was rounding down the number meaning occasionally we estimated too small of a square:

minSize := uint64(math.Ceil(math.Sqrt(float64(totalSharesUsed))))

I was thinking of extracting it out into a separate PR but we're about to overwrite this method entirely with ADR020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants