Identifiable Aborts #9
From a chat with @jonasnick : This doesn't work, at least not in the current RecPedPop with the certifying equality check. Claim: Participants need to ensure that they either sign the transcript or reveal the decryption key (but not both). Now what can happen in the certifying equality check is that honest participant A signs the transcript, and then participant B accuses A of having sent wrong shares. But honest A can't reveal decryption keys at this point because this would violate the above claim. Possible ways around this problem include:
These approaches could work as long as the coordinator is reliable. But if one wants to exclude the faulty participant responsible for the abort, and restart the protocol, then we'll need some kind of retry counter, which is also part of the transcript (e.g., in the context string), and further issues arise:
I don't think we'll include this.
The simple solution is that B reveals the decryption key to the coordinator, who can decrypt all shares encrypted to B and check which of them are incorrect. No honest participant will terminate successfully due to the lack of a signature from B. I'm not sure why we hadn't considered this... But it's still true that this approach doesn't work with a deterministic protocol (and also certainly not with a protocol in which we reuse the decryption key as a long-term signing key). Adding a counter is dangerous because there's always the risk that an old counter value (from a failed session with revealed keys) is reused. Here's a sketch of some protocol modifications:
That approach isn't crazy. But still, the only thing we'd gain is a weak form of IA in which the coordinator is trusted to identify the faulty participant. I'm not convinced that this is worth the modifications. But I wanted to bring this up, after I had the above insight that weak IA is probably simpler than we previously thought. @LLFourn What's your thinking/approach to handling errors during the DKG in Frostsnap? By the way, an entirely different approach to get IA is PVSS (using verifiable encryption), but that's not simpler either.
Taking a step back, I think here's a better approach: If the problem is that the coordinator aggregates shares, then we shouldn't work around this, but simply let the coordinator not aggregate shares in all cases.
While this needs more communication, it's much simpler than what I sketched above, and it's a good enough solution. My current thinking is that this is worth the hassle for achieving IA, and we should include this. IA should only be necessary if a device is malicious or broken, but it's really kind of bad to tell the user: Oh, something is wrong with one of these devices, but it's impossible to tell you more. Just get five new devices and try again. The ability to pinpoint this to two devices (the blaming device could also be the culprit) makes this much easier, and it will also help with debugging.
For a single-user keygen we're definitely not going for IA. The reason is that it requires a PKI to be set up and verified by each other participant to be useful. Saying participant 7 is malicious is only useful if you know what participant 7 is in physical space. You must already have established its identity and mapped it to a physical device. How do you establish this? Probably through the UI of the coordinator device -- but you are trying to design something that is secure (correct IA in this case) with a malicious coordinator, which means it will probably just set this up initially by presenting the keys of participant 7 wrongly to all the devices ahead of time. We can ask the user to check with all the other devices that it has registered the correct key for participant 7, but this seems like more work. You could do this check in batch, but this is not much different from just checking that the views of all devices match at the end.

We will eventually have to approach multi-user key generation and also key generation with non-frostsnap devices. Here it would be helpful to know who to blame. I can't come up with anything better than what you've suggested for the "blame by decrypting" approach. I would be very tempted to say "just use verifiable encryption" of some kind for the shares instead. If the underlying verifiable encryption is homomorphic then you can keep the aggregation -- you know that if you get a wrong share it's always the coordinator's fault, since it must have checked the proofs before aggregating. This means it's just a drop-in replacement for the original protocol that doesn't add extra messages; it just increases the size (significantly) of the encrypted shares the participants send to the coordinator in the first step.
Just to check my understanding: the first point is just to keep the recovery data as small as possible.
Thanks for chiming in!
I understand that you need PKI for IA. But don't you need some form of out-of-band check anyway (e.g., the user compares all the screens of the devices once) to make sure that the coordinator is not doing a MitM attack against everyone? Our design is based on the idea that you'll need a user step anyway, and so we can also use this step to establish a PKI: every device computes a hash of all public keys, and we assume that it has been checked out of band that all the hashes match. Now that I'm writing this, I see that this check could also be done later, e.g., when you create an address for the first time and want to look at all the devices anyway. By the way, it's interesting that you mention single-user vs multi-user explicitly. I had this distinction in the back of my mind somehow, but I never thought about how it would translate to different formal security models. Is it just this: In the multi-user setting, every user trusts their device. In the single-user setting, the user trusts some majority of the devices, but they don't know which ones?
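To make the hash comparison step concrete, here is a minimal sketch of the kind of check described above. The tag string and the truncation to a short, human-comparable digest are illustrative choices, not taken from the BIP draft:

```python
import hashlib

def session_fingerprint(hostpubkeys: list[bytes]) -> str:
    """Hash over all host public keys; every device displays this value and the
    user checks out of band that all displayed values match."""
    h = hashlib.sha256()
    h.update(b"hostpubkey-list")                   # hypothetical domain separator
    for pk in hostpubkeys:                         # all devices must use the same order
        h.update(len(pk).to_bytes(1, "big") + pk)
    return h.hexdigest()[:16]                      # short prefix for on-screen comparison
```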
Hm, not really. A malicious coordinator can always claim that some participant has sent garbage or nothing at all. (Blaming endpoints only makes sense if you know that the network is reliable, but in our case the coordinator is the network.) My thinking was only this: In "most" cases, the coordinator assigns blame, and we'll need to explain somehow that the coordinator can never be fully trusted for this. But if we're in this special case that the shares are wrong, and then some participant A assigns blame, we can at least design it such that the coordinator cannot make participant A wrongly believe that participant B is to blame.
Indeed, it would be very useful to have a not-too-annoying scheme/spec readily available. From the perspective of the DKG, verifiable encryption will certainly be cleaner and simpler than verifiable decryption.
Indeed!
Yeah you have to do the out-of-band check at some point with a possibly malicious coordinator. Actually, talking about this with the team, we came up with the idea of just explicitly setting up the PKI with our devices by providing a cert from a factory key. So a device wouldn't save a key unless e.g. 5 other devices had certified the keygen for a 3-of-5 -- without asking the user to actually eyeball that the 5 devices are the 5 they have in front of them. Assuming the user can trust us to set up and certify these keys correctly (one per device), I think this is sound. Now if we were to be malicious and give a particular malicious device (or the coordinator itself) a bunch of certified keys, the worst they could do is trick an honest device or coordinator into creating a key with a different set of participants. But as you said, you have some chance to catch this in the "address check" anyway if you check with one of the honest devices that was excluded. But this requires you to check the address with every device to get the guarantee, which is unrealistic. We'd probably want a way to opt in to the "trustless" version too, which would introduce the step where the user checks the view of the public key setup out of band. So I think this conversation has changed my mind: doing the PKI setup is the better way to go. The slight bit of complexity is that we'd probably want each device's keygen cert key to be hardware secured, which means we'd have to use RSA 😱 (or ECDSA on secp256r1 if we're lucky!). I guess we could make the long-term RSA/P-256 key sign a short-term secp256k1 x-only key for the session so we could fully embrace this part of the spec.
Yeah, this distinction really matters in our particular product design but not generally. I mostly think about the single-user setting where either all devices are corrupt and the coordinator is honest, or the other way around. The reason is that in a supply-chain attack setting, if we suppose an adversary has the ability to corrupt one device, then why would they not corrupt them all. We try to get decent guarantees in the setting where all devices are corrupt but unable to exfil data back to the remote attacker. Of course we want security guarantees in the coordinator + In the multi-user setting we care more about IA, so a malicious user cannot sabotage a key generation and then use social engineering to try and exclude some other participant.
Yes, the point I was laboring to make was confusing.
After I wrote this I realized that many of the schemes would not lend themselves easily to a coordinator that verifies and then aggregates. Interestingly, the most naive scheme would: prove 256 ElGamal encryptions are to 0 (infinity) or 1 (
Sounds interesting, but I can't follow entirely. Are you saying that the certification is done in the factory, and in a 3-of-5, for every participant A, the other 5 (= 4 + 1 coordinator) devices check the certification of A? This sounds reasonable to me then. My thinking about PKI has also changed a bit in the past days. I believe I had that insight earlier, but it slipped out of my mind. The somewhat obvious part is that you can't avoid MitM without some trust anchor in the real world. As a consequence, the assumption that every participant has obtained an authentic host public key (what we call the identity key) of every participant is more or less required in every payment scenario.

But what's less obvious (I think): In many scenarios, this assumption is rather implicit, and we don't talk about it as much as we do here. And perhaps emphasizing the comparison procedure so much (as we currently do in the BIP draft, for example) is the wrong way to explain it to users and engineers, and we should just relate it to familiar scenarios more. In the simplest form: If you pay to the address of the wrong person, then you paid the wrong person. But even for multisig, this is not new: Suppose you use naive multisig or MuSig2 to create a multisig wallet non-interactively from a bunch of individual public keys. Sure, if you as a participant use the keys of the wrong people, then you have a multisig with the wrong people. In other words, the setup of any multisig wallet, even if much simpler than running a DKG, needs the same checking of public keys. And existing setup procedures and user documentation get this (mostly) right, so we could leave this part to engineers using the BIP, and we don't need to specify the exchange and comparison of public keys as an explicit step of the protocol. Having authentic keys is more like a setup assumption. It needs to be explained properly, but that's all.
I think so. In the factory we sign each device's public key with our factory one (i.e., set up PKI). PKI is set up beforehand, so everyone can verify the other devices' keys against our factory key without doing an out-of-band check with the user. The out-of-band check is still necessary for the paranoid who don't want to rely on us not having leaked our certification key.
Yes, this is a great point. In existing multisig the user is constructing the descriptor manually by moving over the xpubs, so this "checking" happens naturally as part of the UI workflow. In our scenario we have to prod them unexpectedly to do this check so the rest can be done automatically for them. The way they check against a malicious "coordinator" is also to load the descriptor onto the hardware wallet and do address checking from there.
In our current code, it's, in fact, the participant who assigns blame (except in the case of invalid shares, of course). See #29. So with what I suggested above (#9 (comment)), we could also just stay in this model and always let the participants assign the blame. Advantages:
Disadvantages:
Quoting from above
As @real-or-random and I discussed offline, this approach doesn't quite work: in order to check correctness of the share, the signer needs access to the individual VSS commitments and not the summed one. This means that the coordinator also needs to send all individual VSS commitments, which increases the communication to O(t*n) per signer. That's still a possible approach, but a larger change than it seemed initially. In the version where signers provide their decryption key to the coordinator to assign blame, we should ensure that the signers do not generate the same polynomial in subsequent DKG sessions, because the coordinator has partial information about the shares. So this issue is related to #30.
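For reference, the two checks in question (in additive notation; $A_{j,k}$ is the $k$-th coefficient of participant $j$'s VSS commitment and $t$ is the threshold): participant $i$ can verify its aggregated secret share against the summed commitments via

$$\mathrm{secshare}_i \cdot G \stackrel{?}{=} \sum_{j=1}^{n} \sum_{k=0}^{t-1} i^k \cdot A_{j,k},$$

but attributing a failure to a particular sender $j$ requires the per-sender check

$$\mathrm{share}_{j \to i} \cdot G \stackrel{?}{=} \sum_{k=0}^{t-1} i^k \cdot A_{j,k},$$

which needs the individual $A_{j,k}$ rather than only the sums $\sum_j A_{j,k}$, hence the $O(t \cdot n)$ communication per signer mentioned above.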
Note that we have static ECDH keys, so the victim can't reveal the ECDH secret key (because it's the hostseckey). The victim could, in theory, reveal the raw ECDH shared secret and give a proof of correct decryption using a DLEQ proof. We'll use the same raw ECDH shared secret in further sessions with the other participant, but the other participant was malicious anyway, so that's not a problem in theory. But it's still wrong in practice: Either the other participant is really malicious, and then you don't want to run with them again anyway. Or you really do want to rerun with the same set of hostpubkeys (e.g., because some legitimate bug got fixed), and then you don't want the ECDH secret to be known. We could switch to ephemeral ECDH keys, but this needs another round to exchange the keys. I don't see how some variant of committing symmetric encryption (e.g., adding a hash-based MAC) would work, at least not without adding a round: The accusing participant could reveal the symmetric key to the coordinator, but if the accusing participant is malicious and reveals a garbage key, the coordinator doesn't know who's wrong, and then the accused participant needs to exculpate herself by revealing the real key. If this is correct, then we're left only with annoying options:
That's not true. 5d93a8b switches to ephemeral-static ECDH, reusing the same ephemeral ECDH pubkey (let's call it pubnonce) for all recipients. Or more abstractly, this is a form of multi-recipient encryption (built from a multi-recipient KEM, which is in turn built from ECDH). That's a good middle ground:
I've just noticed that the construction of a multi-recipient KEM from ECDH in the paper linked above hashes slightly different stuff, for good reasons. We should just switch to that construction.
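A rough sketch of the ephemeral-static, multi-recipient pattern (one ephemeral pubnonce reused for all recipients), using a toy multiplicative group and a hash-based pad. The KDF inputs and serialization here are illustrative only and deliberately not the exact construction from 5d93a8b or the paper:

```python
import hashlib
import secrets

# Toy group: subgroup of prime order Q generated by G inside Z_P^* (insecure, illustration only).
P_MOD, Q, G = 23, 11, 4

def kdf(shared_point: int, pubnonce: int, recipient_pub: int, length: int) -> bytes:
    # Toy serialization: each group element fits in a single byte here.
    data = b"toy-mr-kem" + bytes([shared_point, pubnonce, recipient_pub])
    out, ctr = b"", 0
    while len(out) < length:
        out += hashlib.sha256(data + bytes([ctr])).digest()
        ctr += 1
    return out[:length]

def encrypt_to_all(plaintexts: dict[int, bytes], recipient_pubs: dict[int, int]):
    esk = secrets.randbelow(Q - 1) + 1
    pubnonce = pow(G, esk, P_MOD)          # one ephemeral pubkey, reused for all recipients
    ciphertexts = {}
    for i, pk in recipient_pubs.items():
        pad = kdf(pow(pk, esk, P_MOD), pubnonce, pk, len(plaintexts[i]))
        ciphertexts[i] = bytes(a ^ b for a, b in zip(plaintexts[i], pad))
    return pubnonce, ciphertexts

def decrypt(seckey: int, pubnonce: int, my_pub: int, ct: bytes) -> bytes:
    pad = kdf(pow(pubnonce, seckey, P_MOD), pubnonce, my_pub, len(ct))
    return bytes(a ^ b for a, b in zip(ct, pad))
```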
Not true. The DLEQ proof will show that the ECDH shared secret is correct. From this, the symmetric key can be derived deterministically and decryption is also deterministic. Thus, there's no way for a malicious "victim" to provide wrong data that will lead to a wrong symmetric key.
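For concreteness, here is a toy sketch of the kind of DLEQ (Chaum-Pedersen) proof being discussed: the "victim" proves that the revealed value S is the correct ECDH shared secret, i.e., that log_G(X) = log_P(S) for its own public key X and the other party's public key P. Toy group parameters, purely illustrative (and, per the later comments in this thread, an approach that ends up being dropped):

```python
import hashlib
import secrets

# Toy subgroup of prime order Q generated by G inside Z_P^* (insecure parameters).
P_MOD, Q, G = 23, 11, 4

def _challenge(*elems: int) -> int:
    h = hashlib.sha256()
    for e in elems:
        h.update(e.to_bytes(4, "big"))
    return int.from_bytes(h.digest(), "big") % Q

def dleq_prove(x: int, other_pub: int):
    """Prove log_G(X) == log_{other_pub}(S) for X = G^x and S = other_pub^x."""
    X = pow(G, x, P_MOD)
    S = pow(other_pub, x, P_MOD)          # the "ECDH shared secret" being revealed
    k = secrets.randbelow(Q - 1) + 1      # fresh nonce; reusing it would leak x
    R1, R2 = pow(G, k, P_MOD), pow(other_pub, k, P_MOD)
    e = _challenge(X, S, R1, R2)
    s = (k + e * x) % Q
    return S, (R1, R2, s)

def dleq_verify(X: int, other_pub: int, S: int, proof) -> bool:
    R1, R2, s = proof
    e = _challenge(X, S, R1, R2)
    return (pow(G, s, P_MOD) == R1 * pow(X, e, P_MOD) % P_MOD and
            pow(other_pub, s, P_MOD) == R2 * pow(S, e, P_MOD) % P_MOD)

# Example: secret x = 5, other party's pubkey = G^3.
S, proof = dleq_prove(5, pow(G, 3, P_MOD))
assert dleq_verify(pow(G, 5, P_MOD), pow(G, 3, P_MOD), S, proof)
```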
Done in 4fb6360.
Tracked in #32 |
As @LLFourn points out in #32, this will make the victim provide the coordinator with a static DH oracle, and this in turn breaks the protocol... Sigh. Thinking about this more, I suspect that all variants that involve revealing some key (or even the received share) are somewhat dangerous because they rely on fresh randomness. Even if we switch to fully ephemeral ECDH (which should be totally fine in theory), revealing the ephemeral ECDH secret key is only safe if the participant doesn't reuse randomness. So we're back to only these ideas:
None of these are great. O(tn) is not the end of the world, but I'm not convinced that it's worth the hassle. I assume it can be annoying, e.g., if some implementation wants to use QR codes.
It doesn't have to be O(tn). Here's what we can do instead: The coordinator computes, for every participant i, the n-1 "partial" pubshares from the other n-1 participants and sends them to i, in addition to the summed VSS commitments. This is still O(n) per message (if everyone gets a different message). The participants will check their secshare as usual, and only if it turns out to be wrong, they'll make use of the partial pubshares to identify the malicious share provider. The participant simply reports to the user who is to blame. This looks wrong because the participant needs to trust the coordinator to compute the partial pubshares correctly, but this is just the trust model for IA anyway. Since the actual check of secshare is unchanged, and the participants don't release additional data (no ECDH oracle...), unforgeability is not affected.

This idea still has the drawback that it makes the interface complex from the perspective of the user: Usually the coordinator will output the blame assignment to the user, but in the case of wrong shares, the participant will do the output. And even in the latter case, we'll still need to trust the coordinator. So what the user needs to understand is something like participant A saying "Participant B has sent an invalid message, or the coordinator is broken." As @jonasnick pointed out to me, simply letting the coordinator output the blame assignment after obtaining it from the victim is not much better. Then the coordinator needs to take into account false accusations: "Participant A claims that participant B has sent an invalid message." This is not really less complex.

But I still think it may be worth sending the accusation to the coordinator. Then, at least, it will always be the coordinator who outputs. And if applications prefer to let the victim provide output to the user, they can still decide to do so.
This argument also works in the other direction. If the protocol doesn't include the message sending the accusation to the coordinator, then applications can still send a message to the coordinator if they want to. This would reduce the complexity of the IA spec by reducing the communication. |
True. At least as long as there's no clearly superior option, it's probably a good idea to do the simplest thing on the crypto side and leave the more UX-ish stuff entirely to the application.
We forgo robustness deliberately because it's hard/impossible in our restricted setting without a reliable broadcast channel or an honest majority.
But we could aim for the weaker goal of identifiable aborts (IA), i.e., it's possible to tell who misbehaved to disrupt the protocol. Let's call the simple omission of messages "passive" disruption, and the contribution of wrong messages "active" disruption. Both forms constitute problems for us due to the lack of consensus and the possibility of equivocation. In the case of passive disruption, we can't even agree whether some participant omitted a message. In the case of active disruption, we could potentially have participants sign all their protocol messages so that victims of active disruption can generate "fraud proofs", but then due to the lack of broadcast, we can't ensure that everyone sees all valid fraud proofs, so we can't reach consensus on the set of participants to blame.
But what we can achieve in our restricted setting is a weaker form of IA that assumes that the coordinator is honest and has reliable point-to-point connections to the participants. If the coordinator is honest, we make it authoritative for determining omissions. In the case of active disruption (i.e., a decrypted share is inconsistent with the VSS commitments), a participant can reveal its decryption key to the coordinator, who can use it to identify the participant that has sent inconsistent protocol messages. (In variants where the coordinator aggregates VSS commitments or encrypted shares, this needs to be done by the coordinator, who has the unaggregated messages.) This won't need additional signatures.
Revealing the decryption key, of course, means that a fresh one needs to be generated in a future retry. That is, if we want IA, we can't be fully deterministic.
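A minimal toy sketch of that coordinator-side check, assuming participant B has revealed its decryption key: the coordinator decrypts every share that was encrypted to B and runs Feldman's verification against the respective sender's individual VSS commitment. The group is a toy multiplicative group and `decrypt_share` is a hypothetical stand-in for the protocol's actual encryption scheme; nothing here is the BIP's actual code.

```python
# Toy parameters: subgroup of prime order Q generated by G inside Z_P^* (insecure).
P_MOD, Q, G = 23, 11, 4

def share_consistent(share: int, vss_commitment: list[int], receiver_idx: int) -> bool:
    """Feldman check: G^share == prod_k A_k^(receiver_idx^k), where A_k are the
    sender's coefficient commitments."""
    expected = 1
    for k, A_k in enumerate(vss_commitment):
        expected = expected * pow(A_k, pow(receiver_idx, k, Q), P_MOD) % P_MOD
    return pow(G, share, P_MOD) == expected

def assign_blame(enc_shares_to_B, vss_commitments, B_idx, B_deckey, decrypt_share):
    """Return the senders whose share for B fails the check, i.e., the participants
    the coordinator would blame. `decrypt_share` is a hypothetical helper."""
    return [sender
            for sender, ct in enc_shares_to_B.items()
            if not share_consistent(decrypt_share(B_deckey, ct),
                                    vss_commitments[sender], B_idx)]
```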