feat!: Refactor PrepareProposal to produce blocks using the non-interactive defaults #692

Merged
merged 46 commits into from
Sep 20, 2022
Changes from 16 commits
Commits
51f1e57
chore: move malleated transaction decoder to the encoding package
evan-forbes Sep 7, 2022
8e45369
Merge branch 'main' into evan/NID-PrepareProposalRefactor
evan-forbes Sep 7, 2022
4ec82ae
feat: refactor prepareProposal to use NID
evan-forbes Sep 7, 2022
766306a
chore: refactor SplitMessages to follow ADR003
evan-forbes Sep 7, 2022
77f2f28
chore: add test for overestimating malleated transaction sizes
evan-forbes Sep 8, 2022
29127ba
chore: clarify comment
evan-forbes Sep 8, 2022
9257ff9
chore: linter
evan-forbes Sep 8, 2022
e23b689
chore: modify test to work with new PrepareProposal
evan-forbes Sep 8, 2022
ef443ad
fix: add notes and debug edge case for pruning
evan-forbes Sep 8, 2022
3fd5f68
chore: move TestSuite func to top of file for easy access
evan-forbes Sep 8, 2022
5e024e4
chore: add fuzzy test for PrepareProcessProposal
evan-forbes Sep 8, 2022
388891c
fix: change a few parameters to reduce number of flakey tests where m…
evan-forbes Sep 8, 2022
8e87a05
feat: export additional testutil function
evan-forbes Sep 8, 2022
402126d
chore: remove todo
evan-forbes Sep 8, 2022
fea19f4
fix test
evan-forbes Sep 8, 2022
de15c44
fix: tests for processproposal
evan-forbes Sep 8, 2022
42b365b
fix: remove unused tests
evan-forbes Sep 8, 2022
6895ace
fix: linter
evan-forbes Sep 8, 2022
25ac91b
add docs for square size
evan-forbes Sep 9, 2022
49f50b8
use correct number in docs for test
evan-forbes Sep 9, 2022
f422e70
spelling
evan-forbes Sep 9, 2022
e93ea4b
use docs for estimated square size
evan-forbes Sep 9, 2022
c4d1b99
wording
evan-forbes Sep 9, 2022
bf97477
spelling
evan-forbes Sep 9, 2022
e62f91f
panic if error
evan-forbes Sep 9, 2022
7a1b6d4
Merge branch 'evan/NID-PrepareProposalRefactor' of https://github.com…
evan-forbes Sep 9, 2022
324966d
refactor docs for overestimateMalleatedTxSize
evan-forbes Sep 9, 2022
9c0aa58
remove extra 2 tx shares for no reason
evan-forbes Sep 9, 2022
cbda052
typo
evan-forbes Sep 9, 2022
a642276
spelling
evan-forbes Sep 9, 2022
92f32fc
rename test
evan-forbes Sep 9, 2022
2623c93
better wording
evan-forbes Sep 9, 2022
7101a7b
spelling
evan-forbes Sep 9, 2022
79f378a
Merge branch 'main' into evan/NID-PrepareProposalRefactor
evan-forbes Sep 9, 2022
9c38323
Apply suggestions from code review
evan-forbes Sep 9, 2022
50baf90
add clarifying comment
evan-forbes Sep 9, 2022
efb41c1
Apply suggestions from code review
evan-forbes Sep 12, 2022
5b1eb21
fix: use non-interactive instead NI in function name
evan-forbes Sep 12, 2022
eb8d8de
Merge branch 'main' into evan/NID-PrepareProposalRefactor
evan-forbes Sep 12, 2022
750e31c
fix: missing function name changes
evan-forbes Sep 12, 2022
7279a0d
refactor: use constants instead of numbers
evan-forbes Sep 12, 2022
2dce08a
Merge branch 'main' into evan/NID-PrepareProposalRefactor
evan-forbes Sep 16, 2022
4484a86
fix: doc typo
evan-forbes Sep 16, 2022
776e013
Merge branch 'main' into evan/NID-PrepareProposalRefactor
evan-forbes Sep 19, 2022
228bb9e
fix!: make splitting shares using indexes for transactions optional.
evan-forbes Sep 19, 2022
7d2ee33
test: comment out flaky tx inclusion tx
evan-forbes Sep 19, 2022
2 changes: 1 addition & 1 deletion app/app.go
@@ -240,7 +240,7 @@ func New(
cdc := encodingConfig.Amino
interfaceRegistry := encodingConfig.InterfaceRegistry

- bApp := baseapp.NewBaseApp(Name, logger, db, MalleatedTxDecoder(encodingConfig.TxConfig.TxDecoder()), baseAppOptions...)
+ bApp := baseapp.NewBaseApp(Name, logger, db, encoding.MalleatedTxDecoder(encodingConfig.TxConfig.TxDecoder()), baseAppOptions...)
bApp.SetCommitMultiStoreTracer(traceStore)
bApp.SetVersion(version.Version)
bApp.SetInterfaceRegistry(interfaceRegistry)
@@ -1,4 +1,4 @@
- package app
+ package encoding

import (
sdk "github.com/cosmos/cosmos-sdk/types"
242 changes: 242 additions & 0 deletions app/estimate_square_size.go
@@ -0,0 +1,242 @@
package app

import (
"bytes"
"math"
"sort"

"github.com/celestiaorg/celestia-app/pkg/appconsts"
"github.com/celestiaorg/celestia-app/pkg/shares"
"github.com/cosmos/cosmos-sdk/client"
"github.com/tendermint/tendermint/pkg/consts"
core "github.com/tendermint/tendermint/proto/tendermint/types"
coretypes "github.com/tendermint/tendermint/types"
)

// prune removes txs until the set of txs will fit in the square of size
// squareSize. It assumes that the currentShareCount is accurate. This function
// is far from optimal because accurately knowing how many shares any given
// set of transactions and their messages take up in a data square that follows
// the non-interactive default rules requires recalculating the entire square.
// TODO: include the padding used by each msg when counting removed shares
func prune(txConf client.TxConfig, txs []*parsedTx, currentShareCount, squareSize int) parsedTxs {
maxShares := squareSize * squareSize
if maxShares >= currentShareCount {
return txs
}
goal := currentShareCount - maxShares

removedContiguousShares := 0
contigBytesCursor := 0
Comment on lines +28 to +29 (Collaborator):

[follow-up PR] we may replace the contiguous terminology with compact

removedMessageShares := 0
removedTxs := 0

// adjustContigCursor checks if enough contiguous bytes have been removed
// in order to tally total contiguous shares removed
adjustContigCursor := func(l int) {
contigBytesCursor += l + shares.DelimLen(uint64(l))
if contigBytesCursor >= consts.TxShareSize {
removedContiguousShares += (contigBytesCursor / consts.TxShareSize)
contigBytesCursor = contigBytesCursor % consts.TxShareSize
}
}

for i := len(txs) - 1; (removedContiguousShares + removedMessageShares) < goal; i-- {
// this normally doesn't happen, but since we don't calculate the number
// of padded shares also being removed, it's possible to reach this value
// should there be many small messages, and we don't want to panic.
if i < 0 {
break
}
removedTxs++
if txs[i].msg == nil {
adjustContigCursor(len(txs[i].rawTx))
continue
}

removedMessageShares += shares.MsgSharesUsed(len(txs[i].msg.GetMessage()))
// we ignore the error here: if there is an error malleating the tx,
// then we need to remove it anyway, and it will not end up contributing
// bytes to the square.
_ = txs[i].malleate(txConf, uint64(squareSize))
adjustContigCursor(len(txs[i].malleatedTx) + appconsts.MalleatedTxBytes)
}

return txs[:len(txs)-(removedTxs)]
}
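
The cursor bookkeeping that prune performs for contiguous (compact) shares can be looked at in isolation. The following is a minimal sketch, not the code in this PR: `shareTally`, `delimLen`, and the share size of 247 bytes are hypothetical names and values chosen for illustration, and the length delimiter is assumed to be a standard unsigned varint.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// delimLen returns how many bytes a uvarint length prefix occupies.
// Assumption: the length delimiter is a standard unsigned varint.
func delimLen(n uint64) int {
	buf := make([]byte, binary.MaxVarintLen64)
	return binary.PutUvarint(buf, n)
}

// shareTally counts whole shares freed as length-delimited txs are removed.
// It is a hypothetical stand-in for the closure used inside prune.
type shareTally struct {
	shareSize int // usable bytes per compact share (assumed value)
	cursor    int // removed bytes that do not yet add up to a whole share
	removed   int // whole shares removed so far
}

// remove records the removal of one tx of txLen bytes plus its delimiter.
func (t *shareTally) remove(txLen int) {
	t.cursor += txLen + delimLen(uint64(txLen))
	t.removed += t.cursor / t.shareSize
	t.cursor %= t.shareSize
}

func main() {
	tally := shareTally{shareSize: 247}
	for _, l := range []int{100, 200, 300} {
		tally.remove(l)
	}
	fmt.Println(tally.removed, tally.cursor) // 2 111
}
```

Bytes left in the cursor are not counted as a removed share, so the tally never overstates how much space has been freed.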

// calculateCompactShareCount calculates the exact number of compact shares used.
func calculateCompactShareCount(txs []*parsedTx, evd core.EvidenceList, squareSize int) int {
txSplitter, evdSplitter := shares.NewContiguousShareSplitter(consts.TxNamespaceID), shares.NewContiguousShareSplitter(consts.EvidenceNamespaceID)
var err error
msgSharesCursor := len(txs)
for _, tx := range txs {
rawTx := tx.rawTx
if tx.malleatedTx != nil {
rawTx, err = coretypes.WrapMalleatedTx(tx.originalHash(), uint32(msgSharesCursor), tx.malleatedTx)
// we should never get to this point, but just in case we do, we
// catch the error here on purpose as we want to ignore txs that are
// invalid (cannot be wrapped)
if err != nil {
continue
}
used, _ := shares.MsgSharesUsedNIDefaults(msgSharesCursor, squareSize, tx.msg.Size())
msgSharesCursor += used
}
txSplitter.WriteTx(rawTx)
}
for _, e := range evd.Evidence {
evidence, err := coretypes.EvidenceFromProto(&e)
if err != nil {
panic(err)
}
err = evdSplitter.WriteEvidence(evidence)
if err != nil {
panic(err)
}
}
txCount, available := txSplitter.Count()
if consts.TxShareSize-available > 0 {
txCount++
}
Collaborator:

[no change needed b/c not modified in this PR][potential refactor] this logic seems confusing because it isn't congruent with the doc comment for txSplitter.Count(). I would expect txCount to account for an additional share if the pending share isn't full but has some bytes in it.

Should the rounding up occur inside Count()?

// Count returns the current number of shares that will be made if exporting.
func (csw *ContiguousShareSplitter) Count() (count, availableBytes int) {
if len(csw.pendingShare.Share) > consts.NamespaceSize {
return len(csw.shares), 0
}
availableBytes = consts.TxShareSize - (len(csw.pendingShare.Share) - consts.NamespaceSize)
return len(csw.shares), availableBytes
}
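
One way the rounding asked about here could live inside the counting method is sketched below. This is a toy model under stated assumptions, not the ContiguousShareSplitter API: `toySplitter`, `write`, and `count` are hypothetical, and the model tracks byte counts only, ignoring namespaces and reserved bytes.

```go
package main

import "fmt"

// toySplitter is a hypothetical, simplified stand-in for a compact share
// splitter. It tracks only how many bytes have been written.
type toySplitter struct {
	shareSize    int // usable bytes per share (assumed value)
	fullShares   int // shares that are completely filled
	pendingBytes int // bytes written to the share still in progress
}

// write appends n bytes, rolling over into new shares as needed.
func (s *toySplitter) write(n int) {
	s.pendingBytes += n
	s.fullShares += s.pendingBytes / s.shareSize
	s.pendingBytes %= s.shareSize
}

// count returns the number of shares an export would produce. A pending
// share that holds any bytes is counted, so callers need no extra rounding.
func (s *toySplitter) count() int {
	if s.pendingBytes > 0 {
		return s.fullShares + 1
	}
	return s.fullShares
}

func main() {
	s := toySplitter{shareSize: 10}
	s.write(25)
	fmt.Println(s.count()) // 3: two full shares plus one partial share
	s.write(5)
	fmt.Println(s.count()) // 3: the partial share is now exactly full
}
```

With the rounding inside `count`, the two `if consts.TxShareSize-available > 0` checks at the call sites would no longer be needed in this model.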

evdCount, available := evdSplitter.Count()
if consts.TxShareSize-available > 0 {
evdCount++
}
return txCount + evdCount
}

// estimateSquareSize uses the provided block data to estimate the square size
// assuming that all malleated txs follow the non-interactive default rules.
func estimateSquareSize(txs []*parsedTx, evd core.EvidenceList) (uint64, int) {
// get the raw count of shares taken by each type of block data
txShares, evdShares, msgLens := rawShareCount(txs, evd)
msgShares := 0
for _, msgLen := range msgLens {
msgShares += msgLen
}

// calculate the smallest possible square size that could contain all the
// messages
squareSize := nextPowerOfTwo(int(math.Ceil(math.Sqrt(float64(txShares + evdShares + msgShares)))))

// the starting square size should be at least the minimum
if squareSize < consts.MinSquareSize {
squareSize = int(consts.MinSquareSize)
}

var fits bool
for {
// assume that all the msgs in the square use the non-interactive
// default rules and see if we can fit them in the smallest starting
// square size. We start the cursor (share index) at the beginning of
// the message shares (txShares+evdShares), because shares that do not
// follow the non-interactive defaults are simple to estimate.
fits, msgShares = shares.FitsInSquare(txShares+evdShares, squareSize, msgLens...)
switch {
// stop estimating if we know we can reach the max square size
case squareSize >= consts.MaxSquareSize:
return consts.MaxSquareSize, txShares + evdShares + msgShares
// return if we've found a square size that fits all of the txs
case fits:
return uint64(squareSize), txShares + evdShares + msgShares
// try the next largest square size if we can't fit all the txs
case !fits:
// increment the square size
squareSize = int(nextPowerOfTwo(squareSize + 1))
}
}
}

// rawShareCount calculates the number of shares taken by all of the included
// txs, evidence, and each msg.
func rawShareCount(txs []*parsedTx, evd core.EvidenceList) (txShares, evdShares int, msgLens []int) {
// msgSummary is used to keep track of the size and the namespace so that we
// can sort the namespaces before returning.
type msgSummary struct {
size int
namespace []byte
}

var msgSummaries []msgSummary

// we use bytes instead of shares for tx and evd as they are encoded
// contiguously in the square, unlike msgs, each of which is assigned its
// own set of shares
txBytes, evdBytes := 0, 0
for _, pTx := range txs {
// if there is no wire message in this tx, then we can simply add the
// bytes and move on.
if pTx.msg == nil {
txBytes += len(pTx.rawTx)
continue
}

// if there is a malleated tx, then we also want to account for the
// tx that gets included on chain. The formula used here
// overcompensates for the actual size of the message, and in some cases can
// result in some wasted square space or picking a square size that is
// too large. TODO: improve by making a more accurate estimation formula
txBytes += overEstimateMalleatedTxSize(len(pTx.rawTx), len(pTx.msg.Message), len(pTx.msg.MessageShareCommitment))

msgSummaries = append(msgSummaries, msgSummary{shares.MsgSharesUsed(int(pTx.msg.MessageSize)), pTx.msg.MessageNameSpaceId})
}

txShares = txBytes / consts.TxShareSize
if txBytes > 0 {
txShares++ // add one to round up
}
// todo: stop rounding up. Here we're rounding up because the calculation for
// tx bytes isn't perfect. This catches those edge cases where we
// estimate the exact number of shares in the square, when in reality we're
// one byte over the number of shares in the square size. This will also cause
// blocks that are one square size too big instead of being perfectly snug.
// The estimation must be perfect or greater than what the square actually
// ends up being.
if txShares > 0 {
txShares++
}

for _, e := range evd.Evidence {
evdBytes += e.Size() + shares.DelimLen(uint64(e.Size()))
}

evdShares = evdBytes / consts.TxShareSize
if evdBytes > 0 {
evdShares++ // add one to round up
}

// sort the msgSummaries by namespace. This is okay to do here
// as we aren't sorting the actual txs, just their summaries for more
// accurate estimations
sort.Slice(msgSummaries, func(i, j int) bool {
return bytes.Compare(msgSummaries[i].namespace, msgSummaries[j].namespace) < 0
})

// isolate the sizes as we no longer need the namespaces
msgShares := make([]int, len(msgSummaries))
for i, summary := range msgSummaries {
msgShares[i] = summary.size
}
return txShares + 2, evdShares, msgShares
}

// overEstimateMalleatedTxSize estimates the size of a malleated tx. The formula it uses will always overestimate.
func overEstimateMalleatedTxSize(txLen, msgLen, sharesCommitments int) int {
// the malleated tx uses meta data from the original tx, but removes the
// message and extra share commitments. Only a single share commitment will
// make it on chain, and the square size (uint64) is removed.
malleatedTxLen := txLen - msgLen - ((sharesCommitments - 1) * 128) - 8
Collaborator:

[question] is it possible to document how this expression was derived?

Member Author:

yeah, we can do a better job of that. tried to with the comment above it, but refactored it to hopefully be clearer. 324966d

Collaborator:

notes from sync:

  • 128 is the # of bytes in a share commitment.
  • 8 looks like a magic number.

Is it possible to use constants for these?

Member Author:

👍

7279a0d

and see #701
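
For readers following this thread without commit 7279a0d at hand, the expression with its literals named might look roughly like the sketch below. The constant names (`shareCommitmentBytes`, `squareSizeBytes`, `estimateBuffer`) are hypothetical and are not claimed to match what the commit introduced. The per-tx malleation overhead is passed in as a parameter because its value is not shown on this page.

```go
package main

import "fmt"

const (
	// shareCommitmentBytes is the size of one share commitment, per the
	// sync notes above.
	shareCommitmentBytes = 128
	// squareSizeBytes is the uint64 square size that the malleated tx drops.
	squareSizeBytes = 8
	// estimateBuffer is the extra margin added in the return statement so
	// that the result errs on the large side.
	estimateBuffer = 100
)

// overEstimate mirrors the arithmetic of overEstimateMalleatedTxSize with
// named constants. malleatedOverhead stands in for appconsts.MalleatedTxBytes.
func overEstimate(txLen, msgLen, commitments, malleatedOverhead int) int {
	// Remove the message, all but one share commitment, and the square size.
	malleated := txLen - msgLen - (commitments-1)*shareCommitmentBytes - squareSizeBytes
	return malleatedOverhead + estimateBuffer + malleated
}

func main() {
	// 1000-byte tx, 500-byte message, 3 commitments, assumed overhead of 32.
	fmt.Println(overEstimate(1000, 500, 3, 32)) // 368
}
```

In the example, 1000 - 500 - 2*128 - 8 leaves 236 bytes, and adding the 100-byte buffer and the assumed 32-byte overhead gives 368.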

// we need to ensure that the returned number is greater than or
// equal to the actual number, which is difficult to calculate without
// actually malleating the tx
return appconsts.MalleatedTxBytes + 100 + malleatedTxLen
}

func nextPowerOfTwo(v int) int {
k := 1
for k < v {
k = k << 1
}
return k
}
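
To see how nextPowerOfTwo feeds the initial guess in estimateSquareSize, the starting-size arithmetic can be run on its own. This is a sketch under assumptions: `startingSquareSize` and `nextPow2` are hypothetical helper names, the bounds of 1 and 128 are example values rather than the real consts, and the upper clamp stands in for the max-size case that the diff handles inside its loop.

```go
package main

import (
	"fmt"
	"math"
)

// nextPow2 returns the smallest power of two that is >= v (1 for v <= 1).
func nextPow2(v int) int {
	k := 1
	for k < v {
		k <<= 1
	}
	return k
}

// startingSquareSize returns the smallest power-of-two width whose square can
// hold shareCount shares, clamped to [minSize, maxSize].
func startingSquareSize(shareCount, minSize, maxSize int) int {
	size := nextPow2(int(math.Ceil(math.Sqrt(float64(shareCount)))))
	if size < minSize {
		return minSize
	}
	if size > maxSize {
		return maxSize
	}
	return size
}

func main() {
	for _, n := range []int{0, 10, 16, 17, 100000} {
		fmt.Println(n, startingSquareSize(n, 1, 128))
	}
	// Growing a square that did not fit: the next power of two after 4 is 8.
	fmt.Println(nextPow2(4 + 1))
}
```

This only gives a lower bound. Message padding under the non-interactive default rules can push the real requirement higher, which is why the diff keeps doubling the size until FitsInSquare succeeds.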