Sharding [ /ˈʃɑːdɪŋ/ ] is a method of partitioning a database horizontally across separate servers to improve scalability, performance and data availability.

In distributed ledgers (DLTs) like Radix, sharding is used to allocate both data storage and transaction execution across a decentralized network of nodes to achieve a high transactional capacity.

https://youtu.be/u0GyEYvK7EI

Background

Traditional blockchains like Bitcoin and Ethereum rely on every node in the network processing and storing every transaction. This provides strong decentralization and security by avoiding reliance on any trusted parties. However, it fundamentally limits the throughput of the system to what a single node can validate, resulting in poor scalability.

Sharding aims to overcome this trade-off between decentralization, security and scalability (often called the blockchain trilemma) by having each node store and process only a subset of the total transactions, known as a ‘shard.’ Splitting the workload in this manner lets a distributed ledger process transactions in parallel across shards, potentially increasing the throughput of the network quadratically compared to a non-sharded system.

Sharding in Radix

Radix has developed an integrated sharding and consensus architecture specifically designed for hyper-scalability of its decentralized network. In Radix’s case, sharding applies to both data availability and transaction execution as both functions are performed by nodes.

Ledger Pre-Sharding

The current Radix Mainnet (Babylon) is pre-sharded into a fixed space of 2^256 shards. Shards are validated by groups of validators called shard groups, and the number of shard groups may grow or shrink dynamically in response to load. Currently, the number of shard groups is capped at one, but this cap will be lifted with Radix’s forthcoming Xi’an release.
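How shards are assigned to shard groups is not specified above. As a purely illustrative sketch, assume each shard group covers an equal, contiguous range of the 2^256 shard space. The function name `shard_group` and the range-based layout are hypothetical and are not taken from Radix's implementation.

```python
SHARD_SPACE = 2**256  # fixed shard space stated for Babylon


def shard_group(shard: int, num_groups: int) -> int:
    """Map a shard index to a group, assuming equal contiguous ranges (hypothetical layout)."""
    if not 0 <= shard < SHARD_SPACE:
        raise ValueError("shard index outside the shard space")
    if num_groups < 1:
        raise ValueError("need at least one shard group")
    # Integer arithmetic only: Python ints are arbitrary precision,
    # so 2^256-sized values are handled exactly.
    return shard * num_groups // SHARD_SPACE


# With a single shard group (the current cap), every shard maps to group 0.
only_group = shard_group(SHARD_SPACE - 1, 1)

# With two groups, the shard space splits at its midpoint.
lower_half = shard_group(2**255 - 1, 2)
upper_half = shard_group(2**255, 2)
```

Under this assumption, changing the number of groups only changes which range a shard falls in; the shard indices themselves never move.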

Pre-sharding contrasts with the dynamic adaptive state sharding model adopted by Shardeum, MultiversX, and NEAR, where shards are added incrementally as required. While sharding can improve scalability, an ad hoc approach leads to substantial difficulties: any change to the shard structure requires reorganizing the entire network, a time-consuming and expensive process. The larger the sharded ledger grows, the more problematic this becomes. Ad hoc sharding also complicates queries and data lookups within the ledger. When data is sharded randomly, it becomes much harder to locate specific transactions or data points since they could be stored anywhere. This slows down queries, as more extensive searches are required.

Deterministic Shard Indexing

Shards on Radix are indexed deterministically by public keys. This means that the shard index for any address can be calculated by taking the modulo of the public key over the shard space.

$$ \Large s_i = p_i \bmod S \qquad \footnotesize \qquad \begin{array}{l} s = \mathrm{shard~index} \\ p = \mathrm{public~key} \\ S = \mathrm{total~shard~space} \end{array} $$
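The modulo rule can be sketched in a few lines. This is a minimal illustration, assuming the public key is interpreted as a big-endian unsigned integer. The function name `shard_index` and the byte encoding are assumptions, and Radix's actual address derivation may differ.

```python
SHARD_SPACE = 2**256  # total shard space S


def shard_index(public_key: bytes, shard_space: int = SHARD_SPACE) -> int:
    """Return s = p mod S, reading the key bytes as a big-endian integer (assumed encoding)."""
    p = int.from_bytes(public_key, "big")
    return p % shard_space


# A made-up 33-byte key (prefix byte 0x02 followed by 32 bytes of 0x11).
# Reducing modulo 2^256 keeps only the low 32 bytes of the integer.
example_key = b"\x02" + b"\x11" * 32
example_shard = shard_index(example_key)

# A toy shard space makes the arithmetic easy to follow: 0x0105 = 261, and 261 mod 16 = 5.
toy_shard = shard_index(bytes([0x01, 0x05]), shard_space=16)
```

Because the result depends only on the key and the fixed shard space, any node can recompute it without a lookup table.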

By deterministically grouping related data into the same shard, Radix avoids the need for expensive data reorganization as the network grows. This creates four major advantages:

  1. Proximity: All transactions from a particular account are guaranteed to be in the same shard, which makes it trivial to identify attempted double-spends.
  2. Asynchrony: Transactions from separate accounts will always involve separate shards, enabling asynchronous, parallel processing of unrelated transactions.
  3. Indexing: Lookup complexity and query time are reduced since shard locations can be easily derived from public keys.
  4. Load balancing: Hash sharding typically results in a more uniform distribution of data across nodes.
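The proximity and asynchrony properties can be illustrated by bucketing transactions by the sender's shard. This is a sketch under stated assumptions: the transaction shape `(sender_key, payload)`, the helper `partition_by_shard`, and the toy shard space of 16 are all hypothetical and chosen only to keep the numbers readable.

```python
from collections import defaultdict


def shard_index(public_key: bytes, shard_space: int) -> int:
    """s = p mod S, reading the key as a big-endian integer (assumed encoding)."""
    return int.from_bytes(public_key, "big") % shard_space


def partition_by_shard(transactions, shard_space):
    """Group (sender_key, payload) pairs by the sender's shard index."""
    buckets = defaultdict(list)
    for sender_key, payload in transactions:
        buckets[shard_index(sender_key, shard_space)].append((sender_key, payload))
    return dict(buckets)


alice = bytes([0x02, 0x0A])  # 0x020A = 522, and 522 mod 16 = 10
bob = bytes([0x03, 0x07])    # 0x0307 = 775, and 775 mod 16 = 7

txs = [(alice, "pay 5"), (bob, "pay 1"), (alice, "pay 5 again")]
buckets = partition_by_shard(txs, shard_space=16)

# Both of Alice's transactions land in shard 10, so a conflict between them
# can be checked within that shard alone. Bob's shard can be processed independently.
```

In a toy space of 16 shards, unrelated accounts will often collide; in a space of 2^256 such collisions are vanishingly unlikely.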

Network Security

A key challenge in sharding distributed ledgers is ensuring sufficient security and node coverage across all shards. If some shards have far fewer nodes than others, those shards become easier to attack. Radix employs several techniques to maintain security across its sharded network:

  1. Node Identity Shard Mapping: To secure the network, validator node addresses are mapped to a single ‘root’ shard. Nodes must permanently maintain their root shards, but can support additional shards to earn more transaction fees. Underserved shards offer higher returns, attracting more validators and preventing any shards from being overlooked. This free market approach maintains security even as the network scales.
  2. Incentives for Multi-Shard Validation: Based on factors like computing resources, validators can choose to support additional shards beyond their root shard. The more shards a node supports, the greater the amount of transaction fees it can earn. This creates an incentive for validators to support as many shards as feasible to maximize profits. In this way, the overall validation workload is distributed across nodes.
  3. Dynamic Shard Support via Free Market: As the network grows, some shards may end up with fewer nodes supporting them compared to other oversubscribed shards. These underserved shards then inherently offer higher potential returns since there is less competition for fees. The higher relative profits attract more validators to begin supporting the underserved shards. This brings coverage back into equilibrium across shards through a free market approach.
  4. Scaling Security Through Staking: In proof-of-stake networks like Radix, staking provides additional security. The more tokens a validator stakes, the more shards it can validate. This allows validation load to scale up securely. High-stake validators may validate transactions across many shards in parallel for efficiency. However, low-stake nodes still play a key role in providing decentralized shard coverage.

Together, these mechanisms are intended to let Radix scale securely across its shard space without running into coverage gaps or centralization issues. The network organically self-regulates to distribute validation across shards.
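The free-market argument can be made concrete with a toy model. The sketch below assumes each shard's fees are split evenly among its validators and that each arriving validator joins whichever shard would pay it the most. The function `assign_validators` and the even-split reward rule are illustrative assumptions, not Radix's actual incentive mechanism.

```python
def assign_validators(shard_fees, num_validators):
    """Greedily place validators on the shard with the highest fee share for a newcomer."""
    counts = [0] * len(shard_fees)
    for _ in range(num_validators):
        # Reward a newcomer would earn on shard i: its fees split among (current + 1) validators.
        best = max(range(len(shard_fees)), key=lambda i: shard_fees[i] / (counts[i] + 1))
        counts[best] += 1
    return counts


# Equal fees: validators spread evenly across the shards.
even_spread = assign_validators([100, 100, 100], 9)

# Unequal fees: coverage follows the fees, but the smaller shard is not abandoned,
# because an empty shard eventually offers a newcomer the better return.
weighted_spread = assign_validators([300, 100], 4)
```

In this model an underserved shard always becomes the most attractive option once the others are crowded, which is the equilibrium behaviour the list above describes.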

Cerberus Consensus

Main article: Cerberus