Initial implementation for Row gossiping #443
Conversation
@@ -1042,7 +1036,7 @@ func (app *badApp) Commit() abci.ResponseCommit {
//--------------------------
// utils for making blocks

func makeBlockchainFromWAL(wal WAL) ([]*types.Block, []*types.Commit, error) {
@liamsi, I need your feedback on this. Previously, it was possible to build a blockchain from the WAL for testing from msgs only, but now that requires state as well. Please confirm whether the solution is correct.
As this is only a test, I'm not really concerned about the change. I'm wondering whether this extends to other places using the WAL as well 🤔
It's not
if cs.ProposalBlockRows.TotalSize() > int(cs.state.ConsensusParams.Block.MaxBytes) {
	return fmt.Errorf("propasal for block exceeding maximum block size (%d > %d)",
		cs.ProposalBlockRows.TotalSize(), cs.state.ConsensusParams.Block.MaxBytes,
	)
}
Now, instead of receiving parts, we can know in advance whether the block exceeds the maximum size just by looking at Proposal.DAHeader, which is good.
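To make that concrete, here is a minimal sketch of such an early check (the names DAHeaderInfo, NumOriginalDataShares, and ShareSize are illustrative assumptions; the actual PR uses cs.ProposalBlockRows.TotalSize() as quoted above): with only the Proposal's DA header, the implied data size can be compared against ConsensusParams.Block.MaxBytes before a single row arrives.

package main

import (
	"errors"
	"fmt"
)

// DAHeaderInfo captures the two quantities assumed here to be recoverable from
// a Proposal's DAHeader: the number of original data shares and the fixed
// share size. Both field names are illustrative, not the actual types.
type DAHeaderInfo struct {
	NumOriginalDataShares int64
	ShareSize             int64
}

var errBlockTooBig = errors.New("proposal for block exceeding maximum block size")

// checkProposalSize rejects a proposal whose implied block data size exceeds
// ConsensusParams.Block.MaxBytes, without downloading any rows.
func checkProposalSize(da DAHeaderInfo, maxBytes int64) error {
	total := da.NumOriginalDataShares * da.ShareSize
	if total > maxBytes {
		return fmt.Errorf("%w (%d > %d)", errBlockTooBig, total, maxBytes)
	}
	return nil
}

func main() {
	da := DAHeaderInfo{NumOriginalDataShares: 8192, ShareSize: 256}
	fmt.Println(checkProposalSize(da, 1<<20)) // 2 MiB of shares vs a 1 MiB limit -> error
}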
So instead of validating whether a row is too large by looking at the number of shares, you lift this to the whole block and check whether that would be too large. Interesting. I think that works and gives more assurance than the row level alone. Looping in @adlerjohn to double-check whether we need to change anything here for spec compliance (independent of your changes: we can keep determining the max block size by ConsensusParams.Block.MaxBytes, right?).
I thought we were planning on eventually removing the DA header from the proposal since it's not strictly necessary. If we do remove it, then we need another way of determining block size, which fortunately is determinable from the Header.availableDataOriginalSharesUsed field.
The size of the original data square, availableDataOriginalSquareSize, isn't explicitly declared in the block header. Instead, it is implicitly computed as the smallest power of 2 whose square is at least availableDataOriginalSharesUsed (in other words, the smallest power of 4 that is at least availableDataOriginalSharesUsed).
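As a quick illustration of that rule (this helper is only a sketch, not part of the spec or the codebase):

package main

import "fmt"

// originalSquareSize returns the smallest power of 2 whose square is at least
// sharesUsed, i.e. the availableDataOriginalSquareSize implied by
// availableDataOriginalSharesUsed as described above.
func originalSquareSize(sharesUsed uint64) uint64 {
	size := uint64(1)
	for size*size < sharesUsed {
		size *= 2
	}
	return size
}

func main() {
	for _, n := range []uint64{1, 5, 16, 17, 64, 65} {
		fmt.Printf("sharesUsed=%d -> squareSize=%d\n", n, originalSquareSize(n))
	}
}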
@adlerjohn, in the current tm implementation (using PartSetHeader) a node checks the block size every time it receives a chunk (part). In the above implementation (using DAHeader) we only need the Proposal and zero chunks (rows) to know that the block exceeds the limit. If we rely on Header.availableDataOriginalSharesUsed instead, we would need to receive the entire block to compute the field and only then reject it.
consensus/state.go
var commit *types.Commit
switch {
case cs.Height == cs.state.InitialHeight:
	// We're creating a proposal for the first block.
	// The commit is empty, but not nil.
	commit = types.NewCommit(0, 0, types.BlockID{}, nil)
case cs.LastCommit.HasTwoThirdsMajority():
	// Make the commit from LastCommit
	commit = cs.LastCommit.MakeCommit()
default: // This shouldn't happen.
	return added, fmt.Errorf("no commit for the previous block")
}

cs.ProposalBlock = block
cs.ProposalBlockRows, err = block.RowSet(context.TODO(), cs.dag)
cs.ProposalBlock = cs.state.MakeBlock(
	cs.Proposal.Height,
	data.Txs,
	data.Evidence.Evidence,
	data.IntermediateStateRoots.RawRootsList,
	data.Messages,
	commit,
	cs.Validators.GetProposer().Address,
)

// TODO(Wondertan): This is unnecessary in general, but for now it writes needed fields
// and specifically NumOriginalDataShares, which likely should be par of the proposal
cs.ProposalBlockRows, err = cs.ProposalBlock.RowSet(context.TODO(), mdutils.Mock())
if err != nil {
	return false, err
	return added, err
}
cs.ProposalBlockParts = cs.ProposalBlock.MakePartSet(types.BlockPartSizeBytes)
This part is important to review. We were discussing that the Proposal needs to carry LastCommit if row gossiping is implemented, but the node can also construct it itself; is this correct?
Also, I see a reason to add NumOriginalDataShares to Proposal; otherwise, validators would need to run ComputeShares themselves to get the number and add it to the Header.
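A rough sketch of the suggestion (the extra field and the helper below are hypothetical, only to illustrate the trade-off): with the count carried in the Proposal, a validator can fill the Header directly instead of recomputing shares from the block data.

package main

import "fmt"

// proposal is a stand-in for types.Proposal extended with the suggested field;
// the field and the helper below are hypothetical, only illustrating the idea.
type proposal struct {
	Height                int64
	NumOriginalDataShares uint64 // suggested addition, not in the current type
}

// numOriginalDataShares shows the two paths described above: use the count
// carried in the Proposal when present, otherwise fall back to recomputing it
// from the block data (the ComputeShares path validators would need today).
func numOriginalDataShares(p proposal, computeShares func() uint64) uint64 {
	if p.NumOriginalDataShares != 0 {
		return p.NumOriginalDataShares
	}
	return computeShares()
}

func main() {
	withField := proposal{Height: 10, NumOriginalDataShares: 128}
	withoutField := proposal{Height: 10}
	expensive := func() uint64 { fmt.Println("recomputing shares..."); return 128 }

	fmt.Println(numOriginalDataShares(withField, expensive))    // 128, no recompute
	fmt.Println(numOriginalDataShares(withoutField, expensive)) // recomputes, then 128
}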
// This test injects invalid field to block and checks if state discards it by voting nil
func TestStateBadProposal(t *testing.T) {
	t.Skip("Block Executor don't have any validation for types.Data fields and we can't inject bad data there")
Pointing to this. Currently, I don't have ideas on what validation rules to add to the block executor for types.Data specifically.
NOTE: Proto breaking failure is expected.
NOTE: I bet EDS caching in Block should fix the timeout issues in CI.
Did a first preliminary pass: this is extra dope and I'm amazed by how quickly this was put together!
I think before we merge this, we should:
- run (realistic) experiments with the current block gossiping
- run (realistic) experiments and tests with the changed block gossiping here
- (although slightly orthogonal) better understand how we want to move forward with the full storage nodes and how we want to store the data on tendermint nodes
Also, I wonder if we should instead try to propose the changes suggested here and here to the tendermint team directly.
@marbar3778 @tessr is that something the tendermint team would be interested in, or is erasure coding off the table and you are aiming to improve gossiping via other means?
for i, r := range rs.rows {
	r.ForEachShare(func(j int, share []byte) {
		shares[(i*size)+j] = share
	})
}
return rsmt2d.ImportExtendedDataSquare(
	shares,
This is only to work around the fact that rsmt2d does not support incrementally handling rows directly, right?
Yep, we only need to pass shares there. Also, that should change to Repair.
if !rs.DAHeader.RowsRoots[row.Index].Equal(&root) {
	return false, ErrInvalidRow
}
Note to myself: what happens after we detect an invalid row?
You can swap the row in that question for a part and ask the same about current tm. We can try following tm's behavior.
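One option, sketched below under the assumption that we mirror how tendermint reactors handle provably bad messages (all names here except StopPeerForError are illustrative, and the policy itself is only a suggestion): treat an invalid row like an invalid part and disconnect the sender.

package main

import (
	"errors"
	"fmt"
)

// ErrInvalidRow mirrors the error returned by the row-root check quoted above.
var ErrInvalidRow = errors.New("invalid row")

// peer and peerStopper are minimal stand-ins for p2p.Peer and p2p.Switch; a
// real reactor could call Switch.StopPeerForError, which tendermint reactors
// already use for malformed messages.
type peer struct{ id string }

type peerStopper interface {
	StopPeerForError(p peer, reason error)
}

type logSwitch struct{}

func (logSwitch) StopPeerForError(p peer, reason error) {
	fmt.Printf("stopping peer %s: %v\n", p.id, reason)
}

// handleRow sketches one possible policy: a row that fails verification
// against the DAHeader is provable misbehavior, so drop the sending peer and
// discard the row.
func handleRow(sw peerStopper, src peer, addRow func() (bool, error)) bool {
	added, err := addRow()
	if errors.Is(err, ErrInvalidRow) {
		sw.StopPeerForError(src, err)
		return false
	}
	return added
}

func main() {
	badRow := func() (bool, error) { return false, ErrInvalidRow }
	handleRow(logSwitch{}, peer{id: "abc123"}, badRow)
}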
@@ -384,7 +385,7 @@ func byzantineDecideProposalFunc(t *testing.T, height int64, round int32, cs *St
// Avoid sending on internalMsgQueue and running consensus state.

// Create a new proposal block from state/txs from the mempool.
block1, blockParts1, _ := cs.createProposalBlock()
block1, blockParts1, blockRows1 := cs.createProposalBlock(cs.privValidatorPubKey.Address())
Not asking to do this in this PR, but I'm wondering whether blockParts will be removed entirely as part of this work. This also trickles into storage, I guess, as tendermint currently stores the data in parts 🤔
Yes, that should also affect storing.
Mad props. I think I have a much clearer picture now that it's basically already implemented 😅
Sorry I don't have much to add/comment; I went through each change and everything seems rational. I'll try to re-review after chewing on it more and getting a better grasp of the consensus reactor.
Any failing non-e2e tests just seem flaky.
As for the e2e tests, the CI isn't posting logs, so I figured I'd post some here. It looks like there are some nil references occurring while handling messages.
full01 logs
generating ED25519 keypair...done
peer identity: 12D3KooWP8efcDSYcrhnJxobCSgCtY1ZvA5uGf8rtJqGZspbSAA1
I[2021-07-07|23:48:18.764] Successfully initialized IPFS repository module=main ipfs-path=ipfs
I[2021-07-07|23:48:19.671] Successfully created embedded IPFS node module=main ipfs-repo=ipfs
I[2021-07-07|23:48:19.672] Version info module=main software= block=11 p2p=8
I[2021-07-07|23:48:19.681] Starting Node service module=main impl=Node
I[2021-07-07|23:48:19.682] Starting StateSyncShim service module=statesync impl=StateSyncShim
I[2021-07-07|23:48:19.682] Starting StateSync service module=statesync impl=StateSync
I[2021-07-07|23:48:20.020] Executed block module=state height=1000 validTxs=11 invalidTxs=0
I[2021-07-07|23:48:20.021] Committed state module=state height=1000 txs=11 appHash=99E8778DB43EF2EE8797F2EBE67C3034C2670347315361228BAEDF6D07509E8E
I[2021-07-07|23:48:20.047] Executed block module=state height=1001 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:20.047] Committed state module=state height=1001 txs=7 appHash=DF439F355ED9C29CCF0B8D2562EE772EC5F88FC2FE9FAA5706E340C2CF7799A0
I[2021-07-07|23:48:20.075] Executed block module=state height=1002 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:20.076] Committed state module=state height=1002 txs=7 appHash=DF61E17EF23DE2E4CE6E10828043830C0045F12051D1CE1FF09C4128D82C8A71
I[2021-07-07|23:48:20.104] Executed block module=state height=1003 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:20.105] Committed state module=state height=1003 txs=7 appHash=FBA99B42D277EB6C14444EF39A8FFD0D4C3373C40AC134AD2E9BF3F318302430
I[2021-07-07|23:48:20.132] Executed block module=state height=1004 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:20.133] Committed state module=state height=1004 txs=7 appHash=0996D3AE39D50B34E6669853B8FA5BFBC423108ABF7ED96A569715B42EB0A035
I[2021-07-07|23:48:20.163] Executed block module=state height=1005 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:20.165] Committed state module=state height=1005 txs=7 appHash=EB1CDE49EB0BA750E500EBF13A2B3859ABBD5B2AFB0C5700A5FC9E9A088F6995
I[2021-07-07|23:48:20.196] Executed block module=state height=1006 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:20.198] Committed state module=state height=1006 txs=7 appHash=498850D69E85E8CC00CFFA921ECF0C72ED48C72DE3B344F48B04BEC0E53DD691
I[2021-07-07|23:48:20.225] Executed block module=state height=1007 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:20.226] Committed state module=state height=1007 txs=7 appHash=37E3B36FC00D0AD4B1A64FE94F8F1B3877B6931D9E494724A6D6CC9A489D9DFA
I[2021-07-07|23:48:20.252] Executed block module=state height=1008 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:20.254] Committed state module=state height=1008 txs=7 appHash=239E0EA404DF6B8C6777091C440D679D82A49F6BF05C7964C4D5E26B2EE8629C
I[2021-07-07|23:48:20.280] Executed block module=state height=1009 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:20.281] Committed state module=state height=1009 txs=7 appHash=AFDA7FF9F989343376E181AD37130EF1F51572771B8EAC48DAC5AFC49B608B7E
I[2021-07-07|23:48:20.307] Executed block module=state height=1010 validTxs=6 invalidTxs=0
I[2021-07-07|23:48:20.307] Updates to validators module=state updates=32DC06149F04267667E5653B361373206B1536C6:50
I[2021-07-07|23:48:20.309] Committed state module=state height=1010 txs=6 appHash=54E872FADD688FA35F6E98135180B466AAE90C5FC711390C5BCF7FEA96DCAC06
E[2021-07-07|23:48:20.887] CONSENSUS FAILURE!!! module=consensus err="runtime error: invalid memory address or nil pointer dereference" stack="goroutine 11472 [running]:\nruntime/debug.Stack(0xc0212753a0, 0x25034e0, 0x3a0f350)\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x9f\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).receiveRoutine.func2(0xc01322a380, 0x2a72f68)\n\t/src/tendermint/consensus/state.go:720 +0x57\npanic(0x25034e0, 0x3a0f350)\n\t/usr/local/go/src/runtime/panic.go:969 +0x1b9\ngithub.com/lazyledger/lazyledger-core/consensus.(*Reactor).broadcastNewValidBlockMessage(0xc01321cd80, 0xc01322a448)\n\t/src/tendermint/consensus/reactor.go:443 +0x67\ngithub.com/lazyledger/lazyledger-core/consensus.(*Reactor).subscribeToBroadcastEvents.func2(0x263c320, 0xc01322a448)\n\t/src/tendermint/consensus/reactor.go:415 +0x45\ngithub.com/lazyledger/lazyledger-core/libs/events.(*eventCell).FireEvent(0xc0196e6460, 0x263c320, 0xc01322a448)\n\t/src/tendermint/libs/events/events.go:198 +0x1e3\ngithub.com/lazyledger/lazyledger-core/libs/events.(*eventSwitch).FireEvent(0xc000ae6850, 0x27e7d34, 0xa, 0x263c320, 0xc01322a448)\n\t/src/tendermint/libs/events/events.go:158 +0xa7\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).enterCommit(0xc01322a380, 0x3f3, 0x0)\n\t/src/tendermint/consensus/state.go:1520 +0x971\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).addVote(0xc01322a380, 0xc00c7f5220, 0xc006cf22d0, 0x28, 0xc021275ac8, 0xe35ea7, 0xc01322a438)\n\t/src/tendermint/consensus/state.go:2159 +0xbe5\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).tryAddVote(0xc01322a380, 0xc00c7f5220, 0xc006cf22d0, 0x28, 0x3b9eae0, 0x34de33d7, 0xed8783444)\n\t/src/tendermint/consensus/state.go:1954 +0x59\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).handleMsg(0xc01322a380, 0x2c343e0, 0xc020fce7d0, 0xc006cf22d0, 0x28)\n\t/src/tendermint/consensus/state.go:820 +0x865\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).receiveRoutine(0xc01322a380, 0x0)\n\t/src/tendermint/consensus/state.go:753 +0x7d6\ncreated by github.com/lazyledger/lazyledger-core/consensus.(*State).OnStart\n\t/src/tendermint/consensus/state.go:393 +0x896\n"
E[2021-07-07|23:48:39.007] Error on broadcastTxCommit module=rpc err="timed out waiting for tx to be included in a block"
E[2021-07-07|23:48:45.516] Error on broadcastTxCommit module=rpc err="timed out waiting for tx to be included in a block"
E[2021-07-07|23:48:49.008] Error on broadcastTxCommit module=rpc err="timed out waiting for tx to be included in a block"
E[2021-07-07|23:48:49.785] Stopping peer for error module=p2p peer="Peer{MConn{10.186.73.4:26656} 0a9c99096b50a6d72a24d5aa5286d3f7022b3555 out}" err=EOF
E[2021-07-07|23:48:49.785] Stopping peer for error module=p2p peer="Peer{MConn{10.186.73.5:26656} 8773a83e2f9fa4bc7a3e94303eb4df33af294288 out}" err=EOF
validator02 logs
I[2021-07-07|23:47:49.387] Starting SignerServer service impl=SignerServer
I[2021-07-07|23:47:49.387] Remote signer connecting to tcp://0.0.0.0:27559
D[2021-07-07|23:47:49.387] SignerDialer: Reconnection failed retries=1 max=100 err="dial tcp 0.0.0.0:27559: connect: connection refused"
D[2021-07-07|23:47:50.387] SignerDialer: Reconnection failed retries=2 max=100 err="dial tcp 0.0.0.0:27559: connect: connection refused"
D[2021-07-07|23:47:51.388] SignerDialer: Connection Ready
generating ED25519 keypair...done
peer identity: 12D3KooWPuqfdneFeVXwk9oMuGfqadLdG727nnsF4pTUBtuzcp4L
I[2021-07-07|23:47:51.399] Successfully initialized IPFS repository module=main ipfs-path=ipfs
I[2021-07-07|23:47:52.558] Successfully created embedded IPFS node module=main ipfs-repo=ipfs
I[2021-07-07|23:47:52.559] Version info module=main software= block=11 p2p=8
I[2021-07-07|23:47:52.570] Starting Node service module=main impl=Node
I[2021-07-07|23:47:52.572] Starting StateSyncShim service module=statesync impl=StateSyncShim
I[2021-07-07|23:47:52.572] Starting StateSync service module=statesync impl=StateSync
E[2021-07-07|23:47:52.674] Stopping peer for error module=p2p peer="Peer{MConn{10.186.73.5:26656} 8773a83e2f9fa4bc7a3e94303eb4df33af294288 out}" err=EOF
I[2021-07-07|23:47:59.918] Executed block module=state height=1000 validTxs=11 invalidTxs=0
I[2021-07-07|23:47:59.918] Committed state module=state height=1000 txs=11 appHash=99E8778DB43EF2EE8797F2EBE67C3034C2670347315361228BAEDF6D07509E8E
I[2021-07-07|23:48:01.575] Executed block module=state height=1001 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:01.576] Committed state module=state height=1001 txs=7 appHash=DF439F355ED9C29CCF0B8D2562EE772EC5F88FC2FE9FAA5706E340C2CF7799A0
I[2021-07-07|23:48:03.117] Executed block module=state height=1002 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:03.118] Committed state module=state height=1002 txs=7 appHash=DF61E17EF23DE2E4CE6E10828043830C0045F12051D1CE1FF09C4128D82C8A71
I[2021-07-07|23:48:05.152] Executed block module=state height=1003 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:05.153] Committed state module=state height=1003 txs=7 appHash=FBA99B42D277EB6C14444EF39A8FFD0D4C3373C40AC134AD2E9BF3F318302430
I[2021-07-07|23:48:06.781] Executed block module=state height=1004 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:06.782] Committed state module=state height=1004 txs=7 appHash=0996D3AE39D50B34E6669853B8FA5BFBC423108ABF7ED96A569715B42EB0A035
I[2021-07-07|23:48:08.565] Executed block module=state height=1005 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:08.566] Committed state module=state height=1005 txs=7 appHash=EB1CDE49EB0BA750E500EBF13A2B3859ABBD5B2AFB0C5700A5FC9E9A088F6995
I[2021-07-07|23:48:10.265] Executed block module=state height=1006 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:10.266] Committed state module=state height=1006 txs=7 appHash=498850D69E85E8CC00CFFA921ECF0C72ED48C72DE3B344F48B04BEC0E53DD691
E[2021-07-07|23:48:11.264] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.264] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.264] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.264] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.264] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.264] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Failed to provide to DHT module=consensus height=1002 err="context canceled"
E[2021-07-07|23:48:11.266] Providing Block didn't finish in time and was terminated module=consensus height=1002
I[2021-07-07|23:48:11.856] Executed block module=state height=1007 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:11.862] Committed state module=state height=1007 txs=7 appHash=37E3B36FC00D0AD4B1A64FE94F8F1B3877B6931D9E494724A6D6CC9A489D9DFA
I[2021-07-07|23:48:13.498] Executed block module=state height=1008 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:13.499] Committed state module=state height=1008 txs=7 appHash=239E0EA404DF6B8C6777091C440D679D82A49F6BF05C7964C4D5E26B2EE8629C
I[2021-07-07|23:48:15.592] Executed block module=state height=1009 validTxs=7 invalidTxs=0
I[2021-07-07|23:48:15.593] Committed state module=state height=1009 txs=7 appHash=AFDA7FF9F989343376E181AD37130EF1F51572771B8EAC48DAC5AFC49B608B7E
I[2021-07-07|23:48:17.204] Executed block module=state height=1010 validTxs=6 invalidTxs=0
I[2021-07-07|23:48:17.204] Updates to validators module=state updates=32DC06149F04267667E5653B361373206B1536C6:50
I[2021-07-07|23:48:17.205] Committed state module=state height=1010 txs=6 appHash=54E872FADD688FA35F6E98135180B466AAE90C5FC711390C5BCF7FEA96DCAC06
I[2021-07-07|23:48:18.784] Executed block module=state height=1011 validTxs=5 invalidTxs=0
I[2021-07-07|23:48:18.785] Committed state module=state height=1011 txs=5 appHash=0DEA959BBD42E163234AD3C1C46539B6DE1C646C693E1063AFB01FFECE7B024A
E[2021-07-07|23:48:19.766] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.767] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Failed to provide to DHT module=consensus height=1007 err="context canceled"
E[2021-07-07|23:48:19.768] Providing Block didn't finish in time and was terminated module=consensus height=1007
E[2021-07-07|23:48:28.876] Error on broadcastTxCommit module=rpc err="timed out waiting for tx to be included in a block"
E[2021-07-07|23:48:33.502] Error on broadcastTxCommit module=rpc err="timed out waiting for tx to be included in a block"
seed02 logs
generating ED25519 keypair...done
peer identity: 12D3KooWSKZKJNGD45HbTA8PEBTGgETi6HQ6cBo5aYB6ixh4KLb8
I[2021-07-07|23:47:46.008] Successfully initialized IPFS repository module=main ipfs-path=ipfs
I[2021-07-07|23:47:46.947] Successfully created embedded IPFS node module=main ipfs-repo=ipfs
I[2021-07-07|23:47:46.947] Version info module=main software= block=11 p2p=8
I[2021-07-07|23:47:46.957] Starting Node service module=main impl=Node
I[2021-07-07|23:47:46.957] Starting StateSyncShim service module=statesync impl=StateSyncShim
I[2021-07-07|23:47:46.958] Starting StateSync service module=statesync impl=StateSync
E[2021-07-07|23:47:47.058] Stopping peer for error module=p2p peer="Peer{MConn{10.186.73.4:26656} 0a9c99096b50a6d72a24d5aa5286d3f7022b3555 out}" err=EOF
E[2021-07-07|23:48:17.063] Stopping peer for error module=p2p peer="Peer{MConn{10.186.73.4:26656} 0a9c99096b50a6d72a24d5aa5286d3f7022b3555 out}" err=EOF
E[2021-07-07|23:48:17.262] CONSENSUS FAILURE!!! module=consensus err="runtime error: invalid memory address or nil pointer dereference" stack="goroutine 1116 [running]:\nruntime/debug.Stack(0xc013ed93a0, 0x25034e0, 0x3a0f350)\n\t/usr/local/go/src/runtime/debug/stack.go:24 +0x9f\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).receiveRoutine.func2(0xc013f32a80, 0x2a72f68)\n\t/src/tendermint/consensus/state.go:720 +0x57\npanic(0x25034e0, 0x3a0f350)\n\t/usr/local/go/src/runtime/panic.go:969 +0x1b9\ngithub.com/lazyledger/lazyledger-core/consensus.(*Reactor).broadcastNewValidBlockMessage(0xc000c6ba80, 0xc013f32b48)\n\t/src/tendermint/consensus/reactor.go:443 +0x67\ngithub.com/lazyledger/lazyledger-core/consensus.(*Reactor).subscribeToBroadcastEvents.func2(0x263c320, 0xc013f32b48)\n\t/src/tendermint/consensus/reactor.go:415 +0x45\ngithub.com/lazyledger/lazyledger-core/libs/events.(*eventCell).FireEvent(0xc0196ba8e0, 0x263c320, 0xc013f32b48)\n\t/src/tendermint/libs/events/events.go:198 +0x1e3\ngithub.com/lazyledger/lazyledger-core/libs/events.(*eventSwitch).FireEvent(0xc0001990a0, 0x27e7d34, 0xa, 0x263c320, 0xc013f32b48)\n\t/src/tendermint/libs/events/events.go:158 +0xa7\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).enterCommit(0xc013f32a80, 0x3e8, 0x0)\n\t/src/tendermint/consensus/state.go:1520 +0x971\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).addVote(0xc013f32a80, 0xc01a3bcfa0, 0xc01b88cf90, 0x28, 0x21b4787, 0xc00055d8c0, 0xc013fae480)\n\t/src/tendermint/consensus/state.go:2159 +0xbe5\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).tryAddVote(0xc013f32a80, 0xc01a3bcfa0, 0xc01b88cf90, 0x28, 0x3b9eae0, 0xf9d3666, 0xed8783441)\n\t/src/tendermint/consensus/state.go:1954 +0x59\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).handleMsg(0xc013f32a80, 0x2c343e0, 0xc0011d2b50, 0xc01b88cf90, 0x28)\n\t/src/tendermint/consensus/state.go:820 +0x865\ngithub.com/lazyledger/lazyledger-core/consensus.(*State).receiveRoutine(0xc013f32a80, 0x0)\n\t/src/tendermint/consensus/state.go:753 +0x7d6\ncreated by github.com/lazyledger/lazyledger-core/consensus.(*State).OnStart\n\t/src/tendermint/consensus/state.go:393 +0x896\n"
E[2021-07-07|23:50:17.061] Stopping peer for error module=p2p peer="Peer{MConn{10.186.73.4:26656} 0a9c99096b50a6d72a24d5aa5286d3f7022b3555 out}" err=EOF
E[2021-07-07|23:52:17.061] Stopping peer for error module=p2p peer="Peer{MConn{10.186.73.4:26656} 0a9c99096b50a6d72a24d5aa5286d3f7022b3555 out}" err=EOF
Also, the full02 node isn't booting up. full02 only connects via a seed node after height 1000, so it relies on some genesis state. It could be that the initial state couldn't be validated; that was at least the cause here.
if part == nil {
	logger.Error("Could not load part", "index", index,
		"blockPartSetHeader", blockMeta.BlockID.PartSetHeader, "peerBlockPartSetHeader", prs.ProposalBlockPartSetHeader)
rs, err := b.RowSet(context.TODO(), mdutils.Mock())
Are we using a mock here just to avoid saving the data via PutBlock?
Yep. Also, I aim to change that before merging. I don't like the current approach I've taken with RowSet; it is not practical for cases like this one.
d0c602a to a74bc01
@Wondertan are you OK with closing this PR as well? I'd keep the branch around as we might pick it up again in the future.
The implementation of #43 is ready for review. There is still work remaining, but the required skeleton is implemented and should be reviewed before any further changes. Currently, we skip a few tests related to Polka cases and will un-skip those before merging.
This PR is based on #427 and targets its branch for convenience; #427 should be merged before this one.
TODO (all of this should be done before merging)