Running full Bitcoin clients is a vital component of the network’s sustainable decentralization and a useful indicator of healthy adoption. Operating a full node, however, is not a convenient way for mainstream users to access Bitcoin.
Full nodes need to sync with the entire Bitcoin blockchain (~200 GB) and connect to multiple peers to relay transactions broadcast across the network. Standard full nodes make 8 outbound peer connections, and transaction relay accounts for the bulk of the bandwidth needed to run a full node.
Gregory Maxwell, a leading Bitcoin developer and co-founder of Blockstream, estimated that transaction relays account for roughly 87 percent of full node bandwidth requirements.
Additionally, full node syncs can take several days to complete and require some basic technical knowledge of Bitcoin. The often cumbersome process of running a full node is a substantial barrier to adoption that keeps many mainstream users from launching a full client.
Several developments have made launching and operating a full client easier, such as Casa’s hardware node and Pierre Rochard’s node launcher, which includes Lightning Network (LN) compatibility with Zap and Joule. Even so, reducing the burden on full node operators remains a prudent long-term goal and is the focus of several improvements to Bitcoin, including MiniSketch.
MiniSketch is a proposed method for ‘set reconciliation’ of mempool sets between nodes in the network, spearheaded by Pieter Wuille, Gregory Maxwell, and Gleb Naumenko.
Syncing Between Nodes
Before diving into MiniSketch, it is relevant to address the syncing process between nodes and the background of set reconciliation.
Set reconciliation is a process in computer science in which two parties holding similar data sets settle (i.e., reconcile) the differences between them to converge on identical copies. Maxwell described the process in a piece by Bitcoin Magazine as akin to syncing phone contact lists between two people who share many of the same contacts.
“You could send them your whole list but it won’t fit on a postcard and would be pretty wasteful in any case, since they already know most of the contacts … It is possible, in fact, to communicate your whole set of contacts to them by sending only as much information as the size of the difference between your lists even without any idea in advance of what the actual differences are.”
Reconciling differences between the data sets of different computers requires bandwidth to identify the specific discrepancies between the two sets and converge on identical copies. Better set reconciliation algorithms find those differences more efficiently, which reduces the bandwidth required.
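To make the contact-list analogy concrete, here is a minimal Python sketch. The names and the two lists are hypothetical, and the code deliberately cheats by comparing both lists in one place, which real peers cannot do. It only shows how small the symmetric difference is compared with sending everything.

```python
# Hypothetical contact lists; the two sides share most entries.
alice = {"Ann", "Ben", "Cara", "Dev", "Eli"}
bob = {"Ann", "Ben", "Cara", "Dev", "Fay"}

# Naive approach: each side ships its entire list to the other.
naive_items_sent = len(alice) + len(bob)

# Symmetric difference: entries held by exactly one side.
difference = alice ^ bob
missing_at_bob = difference & alice    # what Bob needs from Alice
missing_at_alice = difference & bob    # what Alice needs from Bob

print("items sent naively:", naive_items_sent)
print("items that actually differ:", len(difference))
```

The goal of a set reconciliation algorithm is to get the communication cost close to the second number without either side knowing the difference in advance.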
MiniSketch in Bitcoin is an implementation of the PinSketch BCH-based secure sketch algorithm. BCH stands for ‘Bose-Chaudhuri-Hocquenghem,’ a class of cyclic error-correcting codes deployed in applications like satellite communications.
In Bitcoin, MiniSketch implements PinSketch to optimize the distribution of transactions in the network, enabling full clients to connect to more peers with reduced bandwidth requirements.
The data sets being reconciled in Bitcoin are the transactions received and relayed by peer nodes. Most nodes hold many of the same transactions, but the order in which they are received sometimes causes discrepancies, which delays the syncing of data between their mempools and drives bandwidth usage higher.
Nodes in the Bitcoin network broadcast transactions via the network’s gossip protocol, known as diffusion. The goal is to relay transactions to a majority of nodes very rapidly. This leads to inconsistencies in the order of transactions within mempools compared to recently synced blocks.
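A rough way to see why gossip relay is bandwidth-hungry is to count announcements in a toy flood. The Python sketch below is a simplified model, not Bitcoin’s actual relay logic: the ring-like topology, the helper names `circulant_graph` and `flood`, and the rule that a node announces to every peer except the one it first heard from are all assumptions made for illustration.

```python
from collections import deque

def circulant_graph(n, offsets):
    """Link node i to i+d and i-d for every d in offsets (hypothetical topology)."""
    peers = {i: set() for i in range(n)}
    for i in range(n):
        for d in offsets:
            peers[i].add((i + d) % n)
            peers[i].add((i - d) % n)
    return peers

def flood(peers, origin):
    """Flood one transaction announcement; return (nodes reached, announcements sent)."""
    seen = {origin}
    queue = deque([(origin, None)])  # (node, peer it first heard from)
    sent = 0
    while queue:
        node, heard_from = queue.popleft()
        for peer in peers[node]:
            if peer == heard_from:
                continue  # do not echo back to the peer that told us
            sent += 1
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, node))
    return len(seen), sent

peers = circulant_graph(100, [1, 2, 3, 4])  # 100 nodes, 8 peers each
reached, sent = flood(peers, origin=0)
useful = reached - 1  # each node only needs to hear about the transaction once
print(f"reached={reached} sent={sent} useful={useful}")
```

In this model every node needs to hear about the transaction only once, yet each node repeats the announcement to nearly all of its peers, so most announcements are redundant. That redundancy is the overhead set reconciliation aims to cut.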
MiniSketch
MiniSketch is designed to enhance the set reconciliation process by presenting a more efficient mechanism for node mempools to sync and pass only necessary data between them rather than the entire data sets.
Nodes waste substantial bandwidth working out which peers still need which transaction data for the network to stay in sync as miners pick up transactions from the mempool. MiniSketch lets nodes compare their sets using an algorithm whose cost depends only on the data that occurs in one set but not the other.
Typically, the exchange of data between nodes focuses on referencing the entire mempool data sets. MiniSketch permits a much more compact sync (reconciliation) of transaction mempool sets by sketching the differences between data sets via ‘set checksums.’
Set checksums have a predetermined capacity and can be used to sketch the symmetric difference between two sets of data. For instance, if Alice and Bob want to reconcile their node transaction sets, they can use MiniSketch to compute a sketch of the elements within their data sets.
One of the parties, let’s say Alice, sends her sketch to Bob. Bob combines it with the sketch of his own set, which yields a sketch of the symmetric difference: the elements that appear in exactly one of the two sets. Decoding that combined sketch recovers those elements, so Bob learns what Alice has that he lacks, and vice versa, without either side transmitting its full set. He then sends the differences to Alice, and they can both reconcile their transaction sets much more efficiently.
According to the MiniSketch GitHub README file:
“This will always succeed when the size of the difference (elements that Alice has but Bob doesn’t plus elements that Bob has but Alice doesn’t) does not exceed the capacity of the sketch that Alice sent. The interesting part is that this works regardless of the actual set sizes—only the difference matters.”
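The Alice and Bob exchange can be illustrated with a toy sketch of capacity 2. The Python code below is not the MiniSketch library or its API. It is a minimal illustration of the PinSketch idea under several assumptions: elements are nonzero 8-bit values in the field GF(2^8), the sketch holds only the sums of x and x^3, and decoding uses a closed-form shortcut plus brute-force root search in place of the general decoder a real implementation needs. All function names are hypothetical.

```python
FIELD_MODULUS = 0x11B  # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiply two elements of GF(2^8)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= FIELD_MODULUS
        b >>= 1
    return result

def gf_inv(a):
    """Multiplicative inverse by exhaustive search (fine for a 256-element field)."""
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

def make_sketch(elements):
    """Capacity-2 sketch: the sums of x and x^3 over the set (field addition is XOR)."""
    s1 = s3 = 0
    for x in elements:
        if not 1 <= x <= 255:
            raise ValueError("elements must be nonzero 8-bit values")
        s1 ^= x
        s3 ^= gf_mul(x, gf_mul(x, x))
    return s1, s3

def combine(sketch_a, sketch_b):
    """XOR two sketches; shared elements cancel, leaving the sketch of the difference."""
    return sketch_a[0] ^ sketch_b[0], sketch_a[1] ^ sketch_b[1]

def decode(sketch):
    """Recover up to two differing elements, or return None if decoding fails."""
    s1, s3 = sketch
    if s1 == 0:
        return set() if s3 == 0 else None
    # For a difference {a, b}: s1 = a + b and s3 = a^3 + b^3, so a*b = s3/s1 + s1^2.
    product = gf_mul(s3, gf_inv(s1)) ^ gf_mul(s1, s1)
    # a and b are the roots of x^2 + s1*x + a*b; try every nonzero field element.
    roots = {x for x in range(1, 256)
             if (gf_mul(x, x) ^ gf_mul(s1, x) ^ product) == 0}
    expected = 1 if product == 0 else 2
    return roots if len(roots) == expected else None

# Hypothetical short transaction IDs held by each node.
alice_txs = {3, 17, 42, 99, 200}
bob_txs = {3, 17, 42, 150, 200}

alice_sketch = make_sketch(alice_txs)  # Alice sends only this pair of bytes
recovered = decode(combine(alice_sketch, make_sketch(bob_txs)))
print("recovered difference:", recovered)
```

The sketch stays two bytes long no matter how many transactions each side holds, which mirrors the README’s point that only the size of the difference matters. If more than two elements differ, this toy decoder may return None or a wrong answer, so the sketch capacity has to be chosen to cover the expected difference.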
According to Maxwell in the Bitcoin Magazine piece, the team’s simulations indicated that MiniSketch could reduce node transaction relay overhead by as much as 40x.
The two primary advantages of MiniSketch are:
- Node bandwidth reductions.
- The ability of nodes to connect to more peers.
The bandwidth reduction is the most direct benefit of MiniSketch, and the same technique can also be used for more efficient block propagation over low-bandwidth satellite links.
The diminished bandwidth burden per node would also enable nodes to connect with more peers than usual, such as 16 rather than 8. Other advantages and applications of MiniSketch, as cited in the GitHub repo, are its potential combination with dc-nets for ‘cryptographic multi-party anonymous communication,’ and helping to extract a cryptographic key from ‘fuzzy’ biometric data.
Notably, MiniSketch is also optional for node operators: it is not part of Bitcoin’s consensus rules, so it would not require nodes to upgrade to the latest core specification. Instead, operators could choose to run the protocol with other supporting peers to increase their bandwidth efficiency.
A formal BIP for MiniSketch is not yet available, and a future proposal may combine it with another technique known as ‘Invertible Bloom Lookup Tables’ for enhancing block propagation. Because MiniSketch is optional, it is also less likely to get stuck in the backlog behind formal consensus change proposals to Bitcoin.
The overall advantages of MiniSketch are compelling for node operators: it lowers the barrier to operating a full node and makes doing so more efficient. More full Bitcoin clients mean more robust decentralization and a healthier network.