Sharding [ /ˈʃɑːdɪŋ/ ] is a method of partitioning a database horizontally across separate servers to improve scalability, performance and data availability.
In the context of distributed ledger technologies (DLTs), sharding refers to dividing the network's computational and storage workload across multiple smaller groups of nodes, called shards, each responsible for processing a subset of the network's transactions and storing a portion of the global state.
The primary goal of sharding in DLTs is to increase the overall throughput and capacity of the network without sacrificing decentralization or security. By allowing multiple shards to process transactions in parallel, sharding aims to overcome the limitations of ‘full replication’, in which every node must process and store all transactions.
Sharding has gained significant attention in the blockchain community as a potential solution to the scalability trilemma, which posits that a blockchain system can fully achieve only two of three desirable properties: scalability, security, and decentralization. By implementing sharding, projects such as NEAR, MultiversX and Radix aim to maintain high levels of security and decentralization while substantially improving scalability.
The concept of sharding in DLTs extends beyond simply partitioning data; it encompasses complex mechanisms for ensuring data availability, cross-shard communication, and maintaining overall network consistency.
Sharding is a concept borrowed from traditional database systems and adapted for use in distributed ledger technology.
In traditional database systems, sharding is a method for distributing data across multiple machines. It involves breaking a large database into smaller, more manageable partitions called shards. Each shard contains a subset of the data and is stored on a separate server. This approach allows for horizontal scaling, where additional servers can be added to increase capacity and performance.
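The routing step described above can be illustrated with a minimal sketch. It assumes hash-based placement, which is only one of several common strategies, and the names `shard_for_key` and `ShardedStore` are hypothetical. Each in-memory dictionary stands in for a separate server.

```python
import hashlib


def shard_for_key(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (assumed strategy)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce modulo
    # the number of shards so the result is always a valid index.
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store; each dict plays the role of one database server."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value: object) -> None:
        self.shards[shard_for_key(key, len(self.shards))][key] = value

    def get(self, key: str) -> object:
        return self.shards[shard_for_key(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
for user_id in ("alice", "bob", "carol", "dave"):
    store.put(user_id, {"name": user_id})

# Every record lives on exactly one shard.
print([len(shard) for shard in store.shards])
```

Note that with plain modulo placement, changing the number of shards remaps most keys. Production systems often use consistent hashing or range-based placement to limit that movement.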
There are two main types of database sharding:
In the context of blockchain and distributed ledgers, sharding involves dividing the network's computational and storage workload across multiple smaller groups of nodes, each responsible for processing a subset of the network's transactions and storing a portion of the global state.
The key difference in blockchain sharding is that it must maintain the security and decentralization properties of the network while improving scalability. This involves complex mechanisms for ensuring data availability, cross-shard communication, and maintaining overall network consistency.
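To make the cross-shard problem concrete, the following sketch assigns accounts to shards by hash and flags transfers whose sender and receiver fall in different shards. It is a simplified illustration under stated assumptions, not the mechanism of any particular network: `NUM_SHARDS`, `shard_of`, `Transfer` and `group_by_sender_shard` are hypothetical names, and real protocols add validator assignment, data availability checks and atomic commit logic on top.

```python
import hashlib
from dataclasses import dataclass

NUM_SHARDS = 4  # assumed value, chosen only for illustration


def shard_of(account: str, num_shards: int = NUM_SHARDS) -> int:
    """Assign an account to a shard by hashing its identifier."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


@dataclass(frozen=True)
class Transfer:
    sender: str
    receiver: str
    amount: int


def is_cross_shard(tx: Transfer, num_shards: int = NUM_SHARDS) -> bool:
    """A transfer is cross-shard when its two accounts live on different shards."""
    return shard_of(tx.sender, num_shards) != shard_of(tx.receiver, num_shards)


def group_by_sender_shard(txs, num_shards: int = NUM_SHARDS):
    """Bucket transfers by the shard that holds the sender's state."""
    buckets = {i: [] for i in range(num_shards)}
    for tx in txs:
        buckets[shard_of(tx.sender, num_shards)].append(tx)
    return buckets


txs = [Transfer("alice", "bob", 5), Transfer("carol", "carol", 1)]
for tx in txs:
    kind = "cross-shard" if is_cross_shard(tx) else "intra-shard"
    print(tx.sender, "->", tx.receiver, kind)
```

Intra-shard transfers can be processed by a single group of nodes in parallel with other shards. Cross-shard transfers are the ones that need the additional communication and consistency mechanisms mentioned above.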
Radix, for example, takes an approach it calls "pre-sharding": the network launches with its maximum number of shards (2^64, or approximately 18.4 quintillion) already in place. This allows for future scalability without needing to change the fundamental structure of the network as it grows.
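The size of such a fixed shard space can be checked with a few lines of arithmetic. The mapping function below is purely illustrative and is not Radix's actual addressing scheme: `SHARD_SPACE` and `shard_index` are hypothetical names, and the use of the first 8 bytes of a SHA-256 digest is an assumption made for the sketch.

```python
import hashlib

SHARD_SPACE = 2 ** 64  # number of shards fixed at launch in a pre-sharded design


def shard_index(identifier: str) -> int:
    """Map an identifier deterministically into the fixed 64-bit shard space."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    # 8 bytes = 64 bits, so the result always lies in [0, 2**64).
    return int.from_bytes(digest[:8], "big")


print(SHARD_SPACE)                    # 18446744073709551616
print(round(SHARD_SPACE / 1e18, 1))   # 18.4 (quintillion)
print(shard_index("example-account") < SHARD_SPACE)  # True
```

Because the shard space never changes, an identifier's shard index stays the same regardless of how many nodes join the network; only the assignment of nodes to ranges of that space would need to change.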
The primary objectives of implementing sharding in distributed ledgers are: