October 19, 2022

By Cayle Sharrock

Developer Update

A few posts ago, I spoke about our intention to pivot the DAN’s consensus algorithm from independent side-chains towards braided Hotstuff sharding (aka Cerberus)].

In that post I hinted at how building a Cerberus L2 on top of our Mimblewimble L1 offers some advantages over the proof-of-stake implementation that Cerberus’ inventors are building. It’s time to expand on some of those thoughts.

Proof-of-work first

There is a deep-seated conviction within the Tari development community that proof-of-work is the best way we know of to secure decentralised permissionless money. Our scepticism of proof-of-stake revolves around these key points:

This doesn’t mean that PoS is useless. But for securing my money? I believe that the history books will mark PoS L1s alongside MMT as the most catastrophic financial inventions of the early 21st century. I could be wrong. Only time will tell.

Cerberus as a Layer 2

Cerberus is a BFT consensus algorithm. This means that as long as 67% of the network is honest, decisions made by the network are completed with 100% finality and there is no need for proof-of-anything.

Of course, the problem arises when the network is in Byzantine conditions, i.e. there is some issue, malicious or accidental, that prevents more than a third of the network from cooperating with its peers.

In the best case, the network cannot make progress, because nodes cannot prove to themselves that there’s sufficient agreement on what the correct outcome of any decision is. This is what is known as a liveness failure. BFT algorithms can provide safety guarantees, meaning that while the network is healthy, we can be certain that every decision is correct, and is the same decision that would be made by any honest node. But they cannot guarantee liveness, that the network will always be able to make progress.

In the worst case, a super-majority of malicious nodes can rewrite the entire history of the network at will. They can do this at essentially zero cost too, because writing information to a file is very, very cheap, and this is all that is required.

So we need to deal with these two cases. Generally, pure proof-of-stake networks cannot provide liveness guarantees at all. We’ve seen this occasionally with EOS, Solana and I presume others, where the network gets stuck, and it’s literally resolved via a Zoom call with key node operators to reach a social consensus on how to restart the network.

The worst case would be “solved” in the same way; although here’s where things get tricky. If one assumes that the social consensus operators are themselves the super-majority of the consensus algorithm (otherwise, why are they on this Zoom call?), then what incentive would they have to revert to the “correct” history?

As far as I know, Radix are planning to avoid using social consensus in their version of Cerberus by using a sort of self-healing mechanism that temporarily reverts to proof-of-work if the system detects a liveness failure. It’s kinda cool, but also pretty complicated.

The takeaway is that if you’re going to be running deterministic consensus (i.e. a BFT algorithm), you 👏 must 👏 have 👏 a backup 👏 consensus algorithm 👏 to guarantee 👏 liveness.

Now with the decision to pivot to Cerberus, we’re thinking, “Gosh where are we going to find a reliable backup consensus algorithm at this time of night?”

/assets/img/posts/cerbdan.jpg

Tari Mimblewimble as Layer 1

Have you ever wondered why BFT algos require 67% node honesty, but Bitcoin only needs 50.1%? There’s no paradox here. It’s because Bitcoin does not use a BFT consensus mechanism. Nakamoto consensus is a probabilistic consensus algorithm. Decisions are never 100% final (as with BFT algorithms), but they approach finality asymptotically with every block that’s added to the chain.

Another property of proof-of-work-based Nakamoto consensus is that liveness is guaranteed. It’s not a very strong guarantee, mind you. Even if 99% of the hash rate had to disappear, the Bitcoin chain would still grind forwards – at an absolute crawl – but it would still make progress.

Tari’s Mimblewimble base layer is built on top of a proof-of-work-based Nakamoto consensus algorithm, so it has all the liveness-guarantee goodness we’re looking for to use to keep Cerberus marching onward and upward.

You can think of Tari’s base layer as this giant, ponderous pendulum that keeps the super-nimble but (possibly) prone-to-breakdown Cerberus ticking along.

pedulum

In practice this would work something like this:

  • Cerberus is sharded, and each validator node covers a fraction of the total shard space.
  • This shard group, as it is called, is determined for each node by the base layer.
  • Therefore every validator node will have to consult a Tari base node to find out what their shard group is. (They also have to register on the base layer, but more on this later).
  • The base node periodically shuffles the shard groups, meaning that affected validator nodes have to reset, update and manage a different part of the shard space (There will be an RFC describing this mechanism and the myriad edge cases shortly).
  • Let’s say that some validator group is byzantine. Only the instructions that involve the state covered by the byzantine group will be affected, but it might be a significant number; especially if there are still a small of validator nodes in the network.
  • The affected instructions get stuck and cannot be resolved.
  • Eventually, the base layer shuffles the nodes, and enough bad nodes are shuffled out and replaced by good ones that the node group is healthy again and can continue to process instructions as normal.

Sybil resistance

Cerberus is fairly simple to understand if there’s a one-node, one-vote arrangement. Proof-of-stake complicates things quite a bit. Now it’s one piece of stake, one vote. What if a whale node finds itself amongst a school of minnows? Won’t it have a two-thirds majority and just be able to unilaterally decide on instruction outcomes? The short answer is yes.

So to make proof-of-stake work in this context, you need to build in all sorts of safeguards to make sure that no one node dominates their group and to shuffle them out if they do. But you also need a minimum number of nodes in any given group, and you don’t want the same nodes always playing together in case they collude, and so on.

Pretty soon, you have quite a fun combinatorial optimization puzzle to solve.

We can neatly side-step this problem by leveraging the base layer once again.

First, we require all validator nodes to register on the base layer. This takes the form of a special transaction that locks up some amount of XTR along with some other metadata in a VN_REGISTRATION UTXO. This achieves a couple of things:

  • It costs real money and opportunity to become a validator node, which provides Sybil resistance. I propose that this Tari is locked up for a minimum of 1-3 months to make it expensive to constantly register and deregister to try and hide your identity as a naughty node.
  • Now we can run a simple one-node, one-vote system on the Cerberus layer.
  • One can query a base node for a list of all VN_REGISTRATION UTXOs to generate a list of all possible validator nodes and their metadata.
  • Anyone can pull this list, and figure out which nodes manage which shard groups at any given time.

As a near-immutable append-only database, the base layer is also the perfect vehicle for referencing additional global data the DAN needs to run smoothly, including smart contract templates.

Summary

In short, having a proof-of-work base layer confers several advantages over a pure proof-of-stake system:

  • The DAN as a whole can provide (weak) liveness guarantees. We don’t need an additional backup consensus algorithm in the DAN.
  • A simple one-node, one-vote consensus mechanism works perfectly. We do not need complicated validator group management algorithms,
  • Using the base layer as a registrar, we have a very convenient Sybil resistance and registration mechanism for validator nodes, smart contract templates and more.

The more we work on this, the more we discover how synergetically the Tari base layer and Cerberus DAN work together to bring a permissionless, decentralised digital assets platform to the world.