Paxos and Raft are two well-known consensus protocols that have been around for a long time and remain vital to understanding state machine replication in distributed computer systems. Paxos is actually a family of protocols that rely on a group of differing assumptions depending on the system while Raft is an alternative consensus to Paxos designed to be more understandable.
Comprehending both Paxos and Raft is very helpful in furthering knowledge of how distributed consensus protocols work in cryptocurrencies such as proof of work and practical Byzantine Fault Tolerance.
Background on Paxos and Raft
Paxos was initially proposed in 1989 and distinguished itself as a particularly elegant method of proving safety for fault-tolerant distributed consensus. Despite its initial novelty, Paxos is often viewed as challenging to understand due to its broad assumptions and complex behavior.
Raft was developed as a more understandable alternative to Paxos that essentially is equivalent to Paxos in performance and fault-tolerant guarantees. There are extensive resources available on both Paxos and Raft, and they are studied and used broadly in a variety of applications and systems today.
Raft has several open-source reference implementations in multiple languages including Go, Java, C++, and Rust.
What is Paxos?
Consensus in a distributed fault-tolerant system is agreeing on one result among a group of unreliable participants. Paxos is a family of consensus algorithms that make various trade-offs between assumptions about the processors, participants, and messages in a given system. The protocol guarantees safety and is often employed where the durability of large data sets is required.
Asynchronous consensus protocols cannot guarantee both safety and liveness, so they all come with their own inherent trade-offs. Paxos was one of the first distributed fault-tolerant consensus protocols to guarantee safety and attempts to produce liveness by ensuring that a proposed value is eventually selected by the group of participants in a consensus round.
There are three roles in Paxos consensus, known as agents:
The goal of consensus is for a group of participants to come to an agreement on a single value per each round. A round of consensus begins when a proposer sends a proposed value to a group of acceptors. Acceptors may accept the proposed value by a given proposer, and once a certain threshold is met, then that value is approved by the network.
However, for consensus to work correctly, the first condition of Paxos is:
“Acceptors must accept the first proposed value that they receive.”
This leads to the problem of several proposers sending proposed values that are accepted by acceptors, but all of them accept no majority value since they are accepting the first proposed value. Paxos solves this by uniquely indexing each proposed value that an acceptor receives which allows them to accept more than one proposal.
A unique number defines each proposal, and the network selects a value once a specific proposed value is accepted by the majority of acceptors, known as the chosen value. Multiple proposals can be chosen, but it is necessary to validate the safety property by guaranteeing that these proposals all have the same value. As per Leslie Lamport’s definition of the required second condition of Paxos that ensures safety:
“If a proposal with value v is chosen, then every higher-numbered proposal that is chosen has value v.”
Communication in the network is asynchronous, so it is possible that certain acceptors have not received the chosen value, which is fine as long as conditions 1 and 2 are not violated.
Proposers employ certain restrictions as messages to sets of acceptors along with the values. These are called prepare requests and contain 2 primary requests:
- Promise never to accept a proposal less than n (n is the new proposal number)
- Respond with the proposal with the highest number less than n that the acceptor has accepted.
According to Lamport:
“If the proposer receives the requested responses from a majority of the acceptors, then it can issue a proposal with number n and value v, where v is the value of the highest-numbered proposal among the responses or is any value selected by the proposer if the responders reported no proposals.”
Proposers subsequently send an accept request which is acknowledged by the acceptors. The proposer then sends a commit message to the acceptors who can either ignore (without compromising safety) or indicate the success of the value commit. Once a certain threshold of acceptors has committed the value, then the protocol for that consensus round terminates and externalizes the value.
The intricate design of Paxos is that it can accept values when a majority of nodes agree despite other nodes ignoring or denying a proposed value. This differs from previous iterations of consensus that required all nodes to agree and were subject to blockage of the protocol from the failure of single nodes.
As long as the proposal numbers are unique, Paxos can select a value that guarantees safety. It is important to note that an acceptor only needs to remember the highest numbered proposal it has accepted. Conversely, a proposer can always abandon a proposal as long as it does not reissue a proposal with the same unique number.
Breaking down the roles of the proposer and acceptor in the protocol is as follows:
- Submit proposal n to acceptors along with prepare request, wait for a majority to reply.
- If the majority of acceptor reply they agree, they will reply with the agreed value. If majority reject, then abandon and restart the process.
- Proposer subsequently sends a commit message with n and value if the majority accepts.
- If the majority of acceptors accept the commit message, protocol round completes.
- Receive proposal and compare it to the highest numbered proposal already agreed to.
- If n is higher then accept the proposal, if n is lower then reject the proposal.
- Accept subsequent commit message if its value is the same as a previously accepted proposal and its sequence number is the highest number agreed to.
Proposals can make multiple proposals but need to follow the algorithm for each proposal individually.
Finally, the role of the learners is to discover that a majority of acceptors have accepted a proposal from the proposers. A distinguished learner is selected that propagates the chosen value to the other learners in the network. Variations of this process can be used where either all acceptors inform corresponding learners of their decisions or acceptors respond to a distinct set of learners who then propagate the message to the rest of the learners.
Formally, the Paxos algorithm distinguishes a leader (proposer) for each round that is required to make progress. Acceptors can acknowledge the leadership of a proposer which allows Paxos to be used to select a leader within a cluster of nodes. Paxos may stall if two proposers are competing for the leader position with no agreement on which one is the leader, however. It is unlikely that this state of non-termination will persist though.
What is Raft?
Raft was created as a more understandable version of Paxos with the same fault-tolerance and performance guarantees. Raft also improves on building practical implementations of protocols on top of it. Due to Paxos’ complexity, it is not useful for providing a solid foundation to develop on top of. Raft is similar to Paxos, so comparing the two requires a brief breakdown of how Raft simplifies the Paxos process.
Raft employs a leader and follower model based on the assumption that a cluster of nodes only has one elected leader. The leader manages the log replication across the participating nodes and is replaced once it fails or disconnects.
A leader is also elected when the algorithm begins. To give leader selection some context, it plays a vital role in consensus and is distinguishable in specific algorithms. For instance in Nakamoto Proof of Work, leader selection is achieved through the lottery-like mining process for each round, which is approximately every 10 minutes. In Practical Byzantine Fault Tolerance (pBFT), leader selection is performed through a round-robin style format.
Raft selects the leader through a process initiated by a candidate node. If candidates do not receive communication during a phase known as the election timeout, then they vote for themselves after increasing their term-counter and broadcast it to the other nodes. Candidates become followers of other candidates who have a term number at least as large as theirs, and this ripple effect continues among the nodes until one candidate receives a majority of followers.
The leader controls log replication among the nodes where it sends the client request commands to its followers. If a majority of followers confirm replication, then the request is committed. Followers also apply the commits to their local state machines.
Raft retains fault-tolerance from nodes subject to failure or a leader failure by having a new leader force its followers to duplicate its own logs. Any entries that do not agree with each other are deleted, maintaining consistency of log replication.
Leader candidates are required to have a more up-to-date log than follower logs. If a candidate’s log is less up-to-date than a potential follower (a voter in this context), then the candidate is rejected.
Overall, Raft deconstructs consensus into 3 individual sub-problems:
- Leader Election
- Log Replication
The consensus protocol uses a strong leader, meaning that the leader node in Raft exerts substantial influence on the process while remaining restricted by the confines of the protocol. As a result, Raft is more straightforward in design than Paxos.
Paxos and Raft are important consensus protocols that are core components of the larger distributed fault-tolerance ecosystem. While not directly employed in cryptocurrencies, the consensus protocols used in cryptocurrency networks derive many of their characteristic assumptions from the design of both Paxos and Raft.