What are some strategies for database sharding?
A sharding strategy is necessary to determine which record goes into which shard:
Key-Based: One of the columns, called the shard key, is put through a hash function. The result determines the shard for that row. This is also called hash-based sharding or algorithmic sharding.
Range-Based: Each shard holds a contiguous range of shard key values. For example, given a product-price database, prices in the range 0-49 go into shard 1, 50-99 into shard 2, and so on. The price column is the shard key. If the store sells a lot more low-value products, this will result in unbalanced shards and hotspots.
Dictionary-Based: A lookup table maps each shard key value to a shard. This is a flexible approach since there’s no predetermined hash function or ranges. An example is to shard by the customer’s region, such as UK, US, or EU.
Hierarchical: A combination of row and column is used as the shard key. Previously mentioned sharding strategies can be used on the key.
Entity Groups: To facilitate queries across multiple tables, this strategy places related tables within the same shard. This brings stronger consistency, since queries and updates that touch related data can stay within a single shard. For example, all data pertaining to a user resides in the same shard.
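The first three strategies can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the shard count, the range boundaries beyond the 0-49 and 50-99 examples above, the shard names, and the function names. Real databases and sharding middleware implement routing in their own ways.

```python
import bisect
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def key_based_shard(shard_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Key-based: hash the shard key and map the digest to a shard index."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then take the modulus.
    return int.from_bytes(digest[:8], "big") % num_shards


# Assumed exclusive upper bounds: 0-49 -> shard 1, 50-99 -> shard 2,
# 100-149 -> shard 3, and 150 or more -> shard 4.
RANGE_BOUNDS = [50, 100, 150]


def range_based_shard(price: float) -> int:
    """Range-based: find the price range that contains the shard key."""
    return bisect.bisect_right(RANGE_BOUNDS, price) + 1


# Assumed lookup table; the shard names are hypothetical.
REGION_TO_SHARD = {"UK": "shard-uk", "US": "shard-us", "EU": "shard-eu"}


def dictionary_based_shard(region: str) -> str:
    """Dictionary-based: look the shard up in an explicit mapping."""
    try:
        return REGION_TO_SHARD[region]
    except KeyError:
        raise ValueError(f"no shard mapped for region {region!r}") from None
```

A stable hash such as SHA-256 is used instead of Python's built-in hash(), because the built-in string hash is randomized per process and would route the same key to different shards across restarts. Note also that a plain modulus remaps most keys when the shard count changes, which is one reason the lookup-table approach is considered more flexible.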