proper synchronized mining #143
Conversation
This is awesome.
I notice a lot of messages like this:
Jul 14 14:52:33.298557 INFO 101.6153s ERROR << instance 3 >> 2020-07-14T10:52:33.298-0400 WARN miner miner/miner.go:179 mined block in the past {"block-time": "2020-07-15T10:44:12.000-0400", "time": "2020-07-15T10:44:12.100-0400", "duration": 0.1} {"req_id": "8ac15ff3"}
Jul 14 14:52:33.312636 INFO 101.6294s ERROR << instance 4 >> 2020-07-14T10:52:33.312-0400 WARN sub sub/incoming.go:81 Received block with large delay 25 from miner t01001 {"req_id": "8ac15ff3"}
I think the mined block in the past message is happening because we start at genesisTime + 100ms and advance by 1 epoch interval at each step, so we're always 100ms after the epoch when we actually mine the block. Maybe we should mine at a point midway between epochs? I think that would be closer to the real-world scenario, since miners want to push blocks before the epoch so they can propagate.
The Received block with large delay warning is a bit of a red herring. The block validator is not discarding blocks we've already received, so gossipsub can deliver the same block multiple times. The first time we receive a block, we incorporate it into our HEAD, and once all blocks are received, we advance time. If, due to gossipsub amplification, a duplicate block is received after we've fast-forwarded the clock 25s, it will still be delivered and will cause this warning.
Not totally familiar with gossipsub internals, but we should drop (valid) repeated blocks in the validator.
lotus-soup/testkit/node_time.go
build.Clock = n.MockClock

// start at genesis + 100ms.
n.MockClock.Set(genesisTime.Add(100 * time.Millisecond))
This still triggers (after the last set of patches) the mined block in the past warning and related ones. In total we are advancing BlockDelaySecs plus this slight drift, putting us outside of the mining epoch window: the miner only advances BlockDelaySecs from its tipset and sees that it lost the mining window by 100ms. Ideally this should be a subtraction, but in my local tests that breaks other assumptions (understandably, because I'm drifting before genesis time). Maybe we should jump a smaller amount than BlockDelaySecs the first time (to remain inside the mining window when adding the full BlockDelaySecs amount in subsequent rounds).
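To make the drift concrete, a quick back-of-the-envelope sketch (illustrative only; genesisTime and build.BlockDelaySecs are taken from the surrounding code, and the fmt/time imports are assumed):

// start 100ms after genesis, then advance by a full block delay each round:
// the clock ends up 100ms past every epoch boundary the miner expects.
t := genesisTime.Add(100 * time.Millisecond)
for epoch := 1; epoch <= 3; epoch++ {
	t = t.Add(time.Duration(build.BlockDelaySecs) * time.Second)
	boundary := genesisTime.Add(time.Duration(epoch) * time.Duration(build.BlockDelaySecs) * time.Second)
	fmt.Println(t.Sub(boundary)) // prints 100ms on every round
}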
I think the simplest solution here would be to advance the clock after the mining round and not before, as it seems to be done now. (I think the change would translate to something like moving the clock advance logic from the local to the global case in SynchronizeClock.)
That would make sense. I did try different variants and in some cases I ended up hitting the "got a block from the future" error. I think we do need to get this absolutely right, so I'm willing to invest time in nailing the state machine right now so that we're all happy and we can build on a solid base.
I think advancing the clock just before the mining round won't work because of races between miners. Miners who advance their clock sooner will end up propagating what appear to be "blocks from the future" to miners who still haven't advanced their clocks. That's why I opted for this approach, as we have the assurance that all miners are in the right epoch when mining commences in concert.
After removing the drift, and testing atop this patch, we no longer have the spurious logs.
The only one we're left with is the occasional:
WARN sub sub/incoming.go:81 Received block with large delay 25 from miner t01001 {"req_id": "a305c648"}
which I think has to do with gossipsub propagation, and isn't worth investigating right now IMO.
I'm going to give it an in-depth review and run tomorrow but the new logic decoupling of this last iteration is beautiful 🎨
Looking good, left some comments before giving this a run, nothing major.
Co-authored-by: Lucas Molas <[email protected]>
This is looking very good. All the major comments have been addressed. I still need to run this and check the "mined in the past" delay but from previous comments it seems this has been fixed.
One last request: I need some more clarity on the channels that control this system. I left some untidy notes about this which might not add too much value (apologies for that, I can add more clarifications later when I have more time); their intent is to illustrate some of the confusion I have when I try to reason about this code. I actually know what the code is doing, but that is mainly because I'm very familiar with the protocol and can imagine how I would go about doing this, and it's basically aligned with this implementation. But when I actually read through the synchronization logic, I still take a noticeable cognitive hit when following the channels from one function to another.
// all blocks that were produced in this epoch.
// * once that happens, we emit an epoch end signal, which the caller can
//   interpret as a signal to advance the clock and begin a new epoch.
func (m *LotusMiner) SynchronizeMiner(ctx context.Context, epochStart <-chan abi.ChainEpoch, epochEnd chan<- abi.ChainEpoch) {
epochStart:
- what do we start? Mining. It seems to be basically a "go mine" signal.
- "epoch" is confusing: it may seem to imply that we mine on the epoch passed in the channel, when in fact the epoch is used just for debugging purposes.
- the channel seems to be doing two things at once: signalling to mine (this could just be an empty struct) and also telling you which epoch the miner is standing on (do we need to obtain that information from here?)
epochEnd: similar. Do we even read the epoch we're sending, or is this just an "I'm done syncing" signal? The terms "epoch" and "end" seem to imply we are controlling the clock here, but I think this is just a signal that I (the miner) did something (either mined or synced the chain), unrelated to the start/end of an epoch.
Basically this routine is intended to behave like an "actor", as in the actor-oriented programming model. It receives signals to work on epochStart and emits "work done" signals on epochEnd. In Akka terms, you would liken the epochStart channel to an inbox, and epochEnd to an outbox (using their mailbox analogy).
I haven't thought too much about how these epochs are actually used; I started making some of these channels of element type struct{}, but then I realised that I might as well propagate useful info. Although in hindsight, I see how that decision (and the loose usage of the sent/received values) might confuse others.
How about we name these channels mineStart and mineDone? From the viewpoint of this routine, that establishes a clear boundary of signals demarcating each unit of work.
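For illustration, a minimal sketch of that actor shape using the proposed names (not the actual implementation; mineAndAwaitHead is a placeholder for the real per-round work, and the usual testkit imports are assumed):

// Receive a "go mine" signal on mineStart, perform one round of work, and
// report completion on mineDone.
func (m *LotusMiner) SynchronizeMiner(ctx context.Context, mineStart <-chan abi.ChainEpoch, mineDone chan<- abi.ChainEpoch) {
	for {
		select {
		case epoch, ok := <-mineStart:
			if !ok {
				return
			}
			// placeholder: mine for this epoch and wait until our HEAD
			// includes all blocks produced in it.
			m.mineAndAwaitHead(ctx, epoch)
			// "work done" signal for this epoch.
			mineDone <- epoch
		case <-ctx.Done():
			return
		}
	}
}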
Those names sound good.
//
// This method will return immediately, and will spawn a background goroutine
// that performs the clock synchronisation work.
func (n *LotusNode) SynchronizeClock(ctx context.Context) (localAdvance chan<- abi.ChainEpoch, globalEpoch <-chan abi.ChainEpoch, err error) {
global vs local: I have trouble grappling with the topology implied by this locality.
"the user requests local advancement": it doesn't seem to me that the user (miner) requests advancement; it just says "I've synced to this epoch". Yes, you can interpret that as "I'm done, let's move on", but local/global implies to me a set of equals that are only differentiated by how we order the group. Here, instead, we have miners and a central time/chain control, and the miners don't have any leverage over that control (in more advanced use cases, even if a miner says it's done in this round, we may still not advance until some other event happens). I don't see the relation that could classify them into local/global under a centralized control system.
SynchronizeClock is another actor (actor-oriented programming). It encapsulates all control of the clock. No other process in the system (not even the SynchronizeMiner actor) is entitled to change the clock (although we don't enforce that prohibition).
The idea behind this is that when the local node decides, once the enlisted predicates are satisfied, that it's ready to advance the clock, it sends a signal to localAdvance.
Right now we ACTUALLY advance the local clock and advertise that fact to the world. In the future, we could use some form of 3PC (three-phase commit), where we PREPARE [phase 1] the advancement, nodes opt in or opt out of that round via the sync service, and once all nodes have PREPARED, we actually COMMIT [phase 2] the local clock advancement by manipulating the mock clock, and CERTIFY [phase 3] that we are at the new epoch.
Once all nodes have passed the CERTIFY phase, that's when we release the signal on globalEpoch to trigger mining and/or other downstream processes.
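Roughly, the current (pre-3PC) behaviour of that actor could be sketched like this (not the actual implementation; awaitAllAdvanced stands in for the real sync-service barrier, and the Add call assumes the mock clock API of the upstream clock library):

func (n *LotusNode) SynchronizeClock(ctx context.Context) (chan<- abi.ChainEpoch, <-chan abi.ChainEpoch, error) {
	localAdvance := make(chan abi.ChainEpoch)
	globalEpoch := make(chan abi.ChainEpoch)

	go func() {
		defer close(globalEpoch)
		for {
			select {
			case epoch := <-localAdvance:
				// advance our local mock clock by one block delay.
				n.MockClock.Add(time.Duration(build.BlockDelaySecs) * time.Second)
				// wait until every other instance has advanced too
				// (placeholder for the real sync-service barrier).
				if err := n.awaitAllAdvanced(ctx, epoch); err != nil {
					return
				}
				// release the global signal for this epoch, triggering mining
				// and other downstream processes.
				select {
				case globalEpoch <- epoch:
				case <-ctx.Done():
					return
				}
			case <-ctx.Done():
				return
			}
		}
	}()

	return localAdvance, globalEpoch, nil
}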
Yes, I think I had the 3PC model you're describing in mind. I'm fine with this not being implemented right now.
90% of the Received block with large delay warnings are coming from the client (instance 4), which is running one epoch ahead (not sure why), so I'm not sure I agree with the earlier diagnosis of gossipsub amplification. I would like to understand and preferably fix (or justify) this before looking in depth at the other 10% of miner delays. (From what I've seen, miner delays come from blocks the same miner mined, which the validator does not discard because it is in fact the first time it sees them, but I would like to confirm this.)
edit: There is also a soft race condition (for lack of a better term) where each node advances its clock once all miners are done mining, but not necessarily after all nodes are done syncing, which might also explain the reported delays (not totally sure; I'm still having trouble following the local/end and global/start channels back and forth).
go func() {
	// for now, clients are always one epoch ahead locally with regards to miners.
	// keeping the flywheel spinning!
	for epoch := range globalEpochCh {
		localAdvanceCh <- epoch + 1
	}
}()
}
I don't follow the rationale behind this advancement from the comment.
@schomatis Good job catching the fact that it's the client that reports the "Received block with large delay" message. In general, we need to revisit the client's behaviour in all this synchronisation; I'm not convinced by the current solution. The decision I had to make was: "do I want the client to wait until its HEAD is updated, just like miners do?" That seemed a bit too heavy-handed to me, so I decided not to wait for that condition right now, because this PR is just the first step to get fully deterministic tests (a) working [see #146] and (b) doing useful stuff (we will need to simulate faults, forks, mining halting, etc.). Right now the advancement logic is hardcoded, but to enable (b), we need some richer semantics/API where we can attach dynamically evaluated predicates.
In conclusion, I'm not so worried about getting everything right at this precise moment, because this PR is just laying the groundwork and we'll iterate on this code a lot as we introduce those predicates, checkpoints, etc.
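Purely as a hypothetical sketch of what such dynamically evaluated predicates might look like (none of these names exist in the code; AdvancePredicate and holdAtCheckpoint are made up for illustration):

// A predicate consulted before each clock advancement.
type AdvancePredicate func(ctx context.Context, next abi.ChainEpoch) (bool, error)

// Example: refuse to advance past a checkpoint epoch until some external
// condition (e.g. all miners report a synced HEAD) holds.
func holdAtCheckpoint(checkpoint abi.ChainEpoch, condition func() bool) AdvancePredicate {
	return func(ctx context.Context, next abi.ChainEpoch) (bool, error) {
		if next <= checkpoint {
			return true, nil
		}
		return condition(), nil
	}
}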
I don't fully understand all the details here, so I still don't have complete confidence in this implementation (that is on me, not an issue with the code), but I likely won't have time to review anything else today, so if this is blocking the ongoing effort, feel free to go ahead and merge this; we can iterate on it as we go.
// build.BlockDelaySecs = 2
// build.PropagationDelaySecs = 1
Need to sort this out. Synchronised mining doesn't need to reduce the block delay and propagation delay.
In general, none of the future features proposed here seem to me to require two different nodes to see a different time. The biggest goal of this PR should just be to make sure all the nodes are in sync; what each node does (mine or not mine, stay in consensus or fork) can be controlled by other means. Having these "local" clocks seems like too much added complexity for a first iteration, without much benefit, and as shown in #154 it also causes clock misalignments that result in unwanted behavior. Still, I'm approving this PR to explicitly signal that I don't want to stall this effort, and what I'm describing here could potentially be fixed in later iterations (at a bigger cost, IMO, than discussing and fixing it here).
This PR introduces proper synchronized mining via two control planes: a global clock control plane and a mining control plane.
With this PR, we no longer need to backdate genesis nor decrease the BlockDelaySecs and PropagationDelaySecs configurations to accelerate mining.
Global clock control plane
Both miners and clients participate in this control plane. It is used to drive time in mining. Time starts at now, and moves into the future, one BlockDelaySecs at a time.
This is essentially modelled as a distributed semaphore, where all miners and clients wait until they have locally advanced their clocks. Once they all signal that fact via a topic, the semaphore allows them all to proceed at once.
The semaphore fires through a channel, which is then fed to the mining control plane.
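From the point of view of a single instance, the semaphore step for one epoch looks roughly like this (a sketch only; publishAdvanced and advancedCh are placeholders for the real topic publish/subscribe helpers):

func awaitEpoch(ctx context.Context, epoch abi.ChainEpoch, instances int,
	publishAdvanced func(abi.ChainEpoch) error, advancedCh <-chan abi.ChainEpoch) error {
	// announce that we have advanced our local clock to this epoch.
	if err := publishAdvanced(epoch); err != nil {
		return err
	}
	// wait until every instance (miners and clients alike) has announced the same.
	seen := 0
	for seen < instances {
		select {
		case e := <-advancedCh:
			if e == epoch {
				seen++
			}
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return nil
}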
Mining control plane
Only miners participate in this control plane. It feeds off the global clock control plane channel to conduct mining rounds.
As soon as the clock progresses, we attempt to mine a block. We then publish a message on a topic indicating who we are (our address) and whether or not we produced a block.
Next, we subscribe to HEAD changes (which sends us a snapshot of our current head, plus any subsequent changes), and wait until our HEAD becomes the tipset that includes all blocks we know were produced in that round.
At that point, we release a signal via a channel that backfeeds into the global clock control plane, which triggers the next collective epoch.
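A simplified sketch of one such mining round (mineOne, publishResult, and awaitHeadIncludes are illustrative placeholders, not the actual testkit helpers):

func (m *LotusMiner) mineRound(ctx context.Context, epoch abi.ChainEpoch) error {
	// attempt to mine a block for this epoch.
	produced, err := m.mineOne(ctx, epoch)
	if err != nil {
		return err
	}
	// publish on a topic who we are (address) and whether we produced a block.
	if err := m.publishResult(ctx, epoch, produced); err != nil {
		return err
	}
	// subscribe to HEAD changes and wait until our HEAD is the tipset that
	// includes all blocks known to have been produced in this round.
	return m.awaitHeadIncludes(ctx, epoch)
}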
Fixes #135.
Partially addresses #87.
Uses filecoin-project/lotus#2362.
Future
This solution synchronises all miners to perfection, and does not allow us to simulate disruptions, faults, delays, etc. That said, the modularity and the delineation through clean input/output channels should allow us to mangle and manipulate things locally. However, we will need to think of ways to advertise to peers when we deliberately "opt out" of a round, as right now, the code expects full participation of all miners in all mining rounds.