May 27, 2019

Tari Protocol Discussion 39

On Monday, the Tari community discussed the Tari Digital Assets Network. Below is the TL;DR on Monday’s conversation (full transcript included below, only edited for spelling):

  • Comprises exist between speed and decentralisation
  • Should try and minimize communication
  • Dag’s seem promising

Join us for our next discussion on Freenode in #tari-dev.

Discussion times proposed by the Tari community:

Mondays: 6pm CAT (12pm EST)

Thursdays: 11am CAT (5am EST)

To keep up with the latest Tari developments, you can follow the project on Twitter.

Transcript of Monday’s discussion

Today's dev chat should be pretty interesting
18:02 <@CjS77>  For those not familiar with the idea, the Tari Digital Assets Network is our proposed smart contracts platform as described here: https://rfc.tari.com/RFC-0300_DAN.html
18:04 <@CjS77>  We've chatted at length about Validator nodes and committees, but we've never discussed how these VN committees reach consensus on the state of the contracts they're managing
18:04 <@CjS77>  I think it's a good time to open up this topic to discussion
18:06 <@CjS77>  To summarise, VN committees are
18:06 <@CjS77>  * A set of servers entrusted by an Asset Issuer (AI) to manage the state and handle state transitions for the AI's digital asset(s).
18:06 <@CjS77>  * Ultimately selected by the AI, but _could have been_ self-nominated
18:07 <@CjS77>  * Potentially adversarial (One doesn't need permission to operate a VN and nominate it for Asset Committees)
18:08 <Hansie>  How big would the committees typically be? I guess that could influence the consensus mechanism.
18:08 <@CjS77>  So therefore, if an AI wanted to run a decentralised digital asset; operated by nodes that it doesn't know or control, then there needs to be a solid consensus and incentive system to maintain the integrity of the DAN
18:10 <@CjS77>  Hansie: The AI will have a say, but I'd imagine most committees would be 1-5 large. But I envisage a long tail with some DAs desiring high censorship resistance requesting around 100 nodes
18:10 <simian_za>  So a users wallet will issue instructions to a single VN in the committee or all of them?
18:11 <Hansie>  Yes, there is the communication angle as well.
18:12 <@CjS77>  IRC  RFC-0172 covers that
18:13  * simian_za reminds himself
18:14 <Blackwolfsa>  But 100 nodes would take long to reach consensus 
18:14  * @CjS77 A quick scan of 172 suggests it hasn't been explicitly stated there
18:14 <Hansie>  I think we can assume the way wallets will communicate with committees will be independent of the consensus mechanism they employ
18:15 <@CjS77>  ^ Yes.
18:15 <Hansie>  So what are the driving forces to pick a consensus mechanism? Could there be a choice?
18:15 <@CjS77>  Yes, there's a definite trade-off between speed and decentralisation Blackwolfsa
18:16 <simian_za>  Hansie: I don't think that is strictly true, consensus protocols often deal with how instructions are distributed/passed around the committee?
18:17 <@CjS77>  Hansie: Off the top of my head, I'd say that we need something with good liveness properties (i.e. is fast), while being resistant to indeterminate errors (i.e. BFT)
18:18 <neonknight642>  The choice of consensus mechanism also affects the amount of communication required, which should be minimized
18:19 <simian_za>  so the goal is to minimize the amount of data passed around and how many rounds it takes to reach consensus which is proportional to latency?
18:19 <Blackwolfsa>  Bft should reduce traffic, but majority should make consensus easier
18:19 <@CjS77>  Which is a different issue to how the initial instruction is presented to the network
18:20 <Hansie>  simian_za: Maybe we can separate delivering wallet instructions/requests from the consensus itself, i.e. assume all/enough VNs know what to do next.
18:21 <@CjS77>  Bandwidth minimisation is a secondary goal imho. Priority is safety and liveness. (Although I suspect high liveness will favour lower bandwidth consumption anyway)
18:21 <simian_za>  Sure, though if the wallet sends the instruction to all VNs simultaneously then there is an element of the consensus algorithm taken care of: All VNs receive an uncorrupted instruction
18:21 <neonknight642>  I don't think a wallet should trust a single VN, but should select a subset of the committee
18:22 <@CjS77>  Though I could send an invalid instruction :)
18:22 <@CjS77>  "Send all teh stuffs to me"
18:23 <@CjS77>  neonknight642: Doesn't the message get sent to any Node first?
18:23 <simian_za>  if one VN would reject that then all of them would. Isn't the consensus algorithm more about how the VNs distribute the instruction they are voting on AND their and any other VNs votes they have collected during the negotiation?
18:24 <Hansie>  One consensus model is for a VN to process instructions as fast as possible and to then communicate their decision/vote to all other committee members. When enough VNs have signed an instruction it becomes valid.
18:25 <neonknight642>  only if the wallet hasn't previously performed the discovery of the committee members, otherwise the wallet can establish direct P2P with them
18:26 <@CjS77>  What you didn't know, simian_za is that the VN is colluding with me and signs it as a valid instruction. So it's up to the other nodes to reach agreement that we're up to shananigans
18:26 <Hansie>  I do not think it would hurt if a wallet knows who all the committee members are and to then send the same instruction to all of them.
18:27 <simian_za>  Man I love me some shenanigans!
18:28 <@CjS77>  Not to get too off-track here, but may I suggest that a PoW blockchain is UNSUITABLE for this task?
18:28 <Blackwolfsa>  Agreed
18:29 <simian_za>  agreed, so what is the main issue? The VNs telling the rest of the committee what their vote is?
18:30 <Hansie>  Something like that, and to get others to agree
18:30 <@CjS77>  Also, tyo decide on the ORDER of competing, incompatible instructions
18:30 <@CjS77>  (these might not even be malicious)
18:30 <simian_za>  ahhh, that's a tricky addition
18:31 <@CjS77>  e.g. If Instruction 1 -> 2 -> 3 is valid, but 2 -> 1 -> 3 would be invalid for some reason
18:32 <Hansie>  We have the checkpoints as well, that can act as some sort of forced syncing of the asset state. Where all validated (i.e. with enough votes) instructions can be confirmed.
18:33 <Hansie>   A little bit of chaos could maybe be allowed between checkpoints.
18:34 <Blackwolfsa>  If the order of 2-1-3 is important, two should point to some order 1 which the vn should see as missing or orphaned
18:35 <@CjS77>  Well, that's the nub of the consensus problem. Everyone needs to agree on the same order
18:35 <neonknight642>  The consensus protocol should also handle cases where a bad VN signs two or more competing instructions and distribute them to different subsets of the committee.
18:36 <@CjS77>  Yup
18:36 <@CjS77>  Only a BFT algo can deal with that (AFAIK)
18:37 <Hansie>  Yes, the instructions (or sequence of instructions) with the BFT threshold votes must win.
18:37 <@CjS77>  Bear in mind that order isn't ALWAYS important. e.g. if 1 -> 2 -> 3 and 2-> 1 -> 3 lead to the same state, we should just let that go
18:38 <Hansie>  Yup
18:38 <@CjS77>  So this leads me to think that a DAG-like structure might be what we're after
18:39 <Hansie>  There, you said it, the magic word
18:40 <Hansie>  Do we know of DAG implementations that are in production?
18:41 <simian_za>  If memory serves the ones described in TLU are described as handling blocks of transactions, do they also work well for single instructions?
18:42 <@CjS77>  Isn't a single tx just a special case of a block?
18:44 <simian_za>  I guess so, let me rephrase: How do they scale in the case of a single TX per block?
18:46 <simian_za>  in terms of how many rounds of comms are required between N nodes for each instruction for all the nodes to end up with the same DAG?
18:46 <Hansie>  Maybe we should consider removing the block layer, just DAG-stitch all valid instructions together, and wait for pending (not yet valid) instructions to mature.
18:47 <@CjS77>  mature = reach consensus?
18:47 <Hansie>  Yup
18:48 <@CjS77>  Yeah -- I mean we _could|_ have "blocks" -- more like small batches of instructions. Esp if tx's are flying in thick and fast (e.g. in a hypothetical future where Big Neon is selling out a Beyonce concert)
18:49 <@CjS77>  But otherwise I agree
18:49 <Hansie>  The batches makes sense yes
18:50 <@CjS77>  And consensus will typically be reached within a few ms. -- depending on the the number of rounds and messages, which is the issue simian_za has been trying to raise :)
18:50 <@CjS77>  For comparison, PBFT has O(n^4) messages
18:50 <@CjS77>  where n is the size of the committee
18:51 <@CjS77>  but linBFT makes some adjustments and reduces that to O(n)
18:52 <@CjS77>  https://arxiv.org/abs/1807.01829
18:52 <simian_za>  Ja, its feels to me that the main metric for evaluating how efficient a given consensus algorithm is when running at the ragged edge (i.e. instructions are validated as quickly as they arrive) is how many rounds of comms there need to be.
18:52 <simian_za>  that sounds promising
18:52 <Hansie>  Agreed
18:53 <@CjS77>  I suspect that any robust system will have at least 3 rounds (ala a Sigma protocol). Anything less than that and my hunch is that there's an attack of some sort lurking. This is just an opinion and isn't backed up by any evidence
18:55 <@CjS77>  linBFT tl;dr:
18:55 <@CjS77>  There's a leader for the round, who submits a hash of the new state to all peers (n messages)
18:55 <Hansie>  On the earlier topic, because we can do not have 'block' to be mined will and 'batch' the instructions, a batch of valid (consented) instructions could be ordered in some special deterministic way to represent a block. So that all VNs can arrive at the same answer.
18:55 <@CjS77>  Nodes, confirms receipt of the hash (n messages)
18:56 <Hansie>  CjS77: Agreed
18:56 <@CjS77>  The leader collects 67% confirmations and again sends out a message to all nodes saying "y'all can commit to this" along with a n-of-m signature proving as much
18:57 <@CjS77>  The nodes reply with "sweet, updated".
18:57 <@CjS77>  The leader responds with "67% have updated, you can finalise", along with a threshold signature
18:57 <@CjS77>  So it's a lot of messages, but O(n)
18:58 <@CjS77>  If there are problems, there are a different set of messages (I just described the happy path)
18:58 <simian_za>  It sounds so easy going, I like it
18:58 <@CjS77>  And there's another optimisation that can reduce things to O(log n) messages that I won't get into
18:59 <neonknight642>  If there is a problem, the wallet can always resubmit the instruction to a different VN in the committee
18:59 <simian_za>  What happens when the leader doesn't perform? A timeout?
18:59 <@CjS77>  Yeah, another leader gets elected after a timeout
18:59 <@CjS77>  But the election is straightforward and takes n messages again
19:01 <@CjS77>  The algorithm is evocatively named "Hot Stuff, the BFT devil"
19:01 <simian_za>  How often does the leader change?
19:01 <@CjS77>  And on that bombshell, it's time to go
19:01 <simian_za>  Should probably just read the paper :)
19:01 <Hansie>  Greatness
19:01 <simian_za>  g'night
19:02 <@CjS77>  simian_za: I'm thinking that the first VN that gets an instruction from "outside" can just take leadership for that instruction.
19:03 <@CjS77>  So leaders potentially change with every instruction
19:03 <simian_za>  then there is just an issue with ordering if two VNs get different instructions at the same time. Does the election process break a tie like that?
19:04 <Hansie>  My thinking as well. Multiple VNs can receive the same instruction almost simultaneously.
19:04 <@CjS77>  We'll have to think that through, because linBFT just describes the algo and doesn't try and use it in a DAG context
19:05 <@CjS77>  Indeex the race condition scenario must be handled without any snags
19:06 <@CjS77>  I suspect, when the hash is sent round, nodes will start to say "I've already seen this one" and not send the receipt
19:06 <@CjS77>  But we must work through the logic very carefully
19:06 <@CjS77>  Perhaps in our next meeting :)
19:07 <simian_za>  Awesome, the leader based method does seem like the way forward to me
19:07 <@CjS77>  My initial aversion to LB methods is abating somewhat; but only because there's a very elegant way to pick a new leader very quickly
19:08 <Blackwolfsa>  But what do we do if you keep attacking the leader? 
19:09 <simian_za>  indeed, the leader needs to be very transient but a mediator simplifies many parts of the protocol
19:10 <Hansie>  What about leaderless BFT protocols?
19:10 <Hansie>  "The protocols differ from the traditional consensus protocols and the Nakamoto consensus protocols by not requiring an elected leader. Instead, the protocol simply guides all the nodes to consensus."
19:11 <@CjS77>  > if you keep attacking the leader?
19:11 <@CjS77>  Implies you're attacking the entire VN
19:11 <@CjS77>  committee
19:12 <@CjS77>  Assuming it takes time to a) Find out the Ip of the new leader and b) switch DDOS resources onto that machine
19:13 <@CjS77>  ^^ I don't know of those assumptions are valid or not. Network experts can advise. fluffypony?
19:13 <@CjS77>  Hansie: Leaderless BFT would be better, but all the ones I've read about are MUCH more complicated
19:14 <@CjS77>  and complexity is the enemy of good (or whatever) )
19:14 <Hansie>  True
19:15 <@CjS77>  lol. My typing. Blackwolfsa -- maybe you should run this transcript through a spell checker before posting as a blog :)
19:16 <Blackwolfsa>  Will do, my vscode has one, will fix spelling 
19:18 <@CjS77>  Here's a reference for HotStuff: https://ui.adsabs.harvard.edu/abs/2018arXiv180305069Y/abstract
19:19 <@CjS77>  Here's a better one: https://arxiv.org/abs/1803.05069