Horizontal scaling (or scaling out) is the process of increasing database capacity by adding more servers instead of upgrading a single server. In MongoDB, sharding is the method used for horizontal scaling.
Implementation of Horizontal Scaling
Follow the steps below to implement sharding in MongoDB successfully.
Step 1: Set Up the Environment
Before starting, ensure you have multiple servers available to act as shards, config servers, and query routers.
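When planning which machine plays which role, it can help to write the topology down and check that no two processes claim the same host and port. The following is a minimal sketch of such a check; the host names and the findAddressCollisions helper are hypothetical and only illustrate the idea.

```javascript
// Hypothetical topology plan. The host names are placeholders, not real servers.
const plan = {
  shards: ["shard-a.example.net:27018", "shard-b.example.net:27018"],
  configServers: [
    "cfg-1.example.net:27019",
    "cfg-2.example.net:27019",
    "cfg-3.example.net:27019",
  ],
  routers: ["router-1.example.net:27017"],
};

// Return every host:port that is claimed by more than one process.
function findAddressCollisions(topology) {
  const owners = new Map();
  for (const [role, addresses] of Object.entries(topology)) {
    for (const address of addresses) {
      const key = address.toLowerCase();
      if (!owners.has(key)) owners.set(key, []);
      owners.get(key).push(role);
    }
  }
  return [...owners.entries()]
    .filter(([, roles]) => roles.length > 1)
    .map(([address, roles]) => ({ address, roles }));
}

const collisions = findAddressCollisions(plan);
console.log(collisions.length === 0 ? "no address collisions" : collisions);
```

This matters most when trying the steps on a single test machine, where a shard and a config server could easily both be given an address such as localhost:27019.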
Step 2: Install MongoDB on All Servers
Install MongoDB on all servers that will act as shards, config servers, and query routers.
Step 3: Configure Shards
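Start each mongod instance that will serve as a shard with the --shardsvr option. Shards run as replica sets, so also pass --replSet with the shard's replica set name. For example:
mongod --shardsvr --replSet shard1 --port 27018 --dbpath /data/shard1 --logpath /var/log/mongodb/shard1.log --fork
Repeat the process for all shard servers, giving each shard its own replica set name, port, and data directory, and initiate each replica set with rs.initiate() before adding it to the cluster.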
Step 4: Configure Config Servers
Start the config servers with the --configsvr option. Config servers run as a replica set; ensure you have at least three members for redundancy.
mongod --configsvr --replSet configReplSet --port 27019 --dbpath /data/config --logpath /var/log/mongodb/config.log --fork
Repeat the process for all config servers, then initiate the replica set with rs.initiate() from one of its members.
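Shards and config servers only form working replica sets once each set has been initiated. The sketch below builds the configuration documents that rs.initiate() accepts, assuming each mongod was started with a matching --replSet name. The buildReplSetConfig helper is hypothetical, and the addresses reuse the placeholder hosts and ports from this guide.

```javascript
// Build the configuration document that rs.initiate() expects.
// isConfigServer adds the configsvr flag used by the config server replica set.
function buildReplSetConfig(name, hosts, isConfigServer = false) {
  const config = {
    _id: name,
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
  if (isConfigServer) config.configsvr = true;
  return config;
}

// Placeholder addresses; replace them with the real members of each set.
const shard1Config = buildReplSetConfig("shard1", ["localhost:27018"]);
const configReplSetConfig = buildReplSetConfig(
  "configReplSet",
  ["localhost:27019", "localhost:27020", "localhost:27021"],
  true
);

console.log(JSON.stringify(shard1Config));
console.log(JSON.stringify(configReplSetConfig));

// In the MongoDB shell, connected to one member of the matching set:
//   rs.initiate(shard1Config)         on a shard member
//   rs.initiate(configReplSetConfig)  on a config server member
```

Run rs.initiate() once per replica set, from a single member of that set, and check the result with rs.status() before moving on.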
Step 5: Start the Query Routers (Mongos)
Start the Mongos instances to act as query routers:
mongos --configdb configReplSet/localhost:27019,localhost:27020,localhost:27021 --logpath /var/log/mongodb/mongos.log --fork
Ensure the configDB setting correctly points to the config servers.
Step 6: Connect to the Mongos
Connect to the mongos instance using the MongoDB shell (mongosh; on releases before 6.0, the legacy mongo shell accepts the same option):
mongosh --port 27017
Step 7: Add Shards to the Cluster
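Within the MongoDB shell, add each shard to the cluster by giving its replica set name followed by the host and port of one of its members. The addresses below are placeholders; each shard needs its own host and port, separate from those of the config servers: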
sh.addShard("shard1/localhost:27018")
sh.addShard("shard2/localhost:27019")
sh.addShard("shard3/localhost:27020")
Step 8: Enable Sharding for a Database
Enable sharding for a specific database:
sh.enableSharding("myDatabase")
Step 9: Shard a Collection Using a Shard Key
Shard a collection within the database by specifying a shard key. Choosing the right shard key is critical for even data distribution and query efficiency.
sh.shardCollection("myDatabase.myCollection", { shardKey: 1 })
Why Horizontal Scaling is Needed
- Handle Large Data: Efficiently manage large datasets and high-traffic workloads.
- Prevent Bottlenecks: Avoid overloading a single server by distributing data and queries.
- Improve Performance: Reads and writes are distributed across multiple servers.
- High Availability: When each shard is deployed as a replica set, the cluster can keep serving data if an individual node fails.
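The distribution benefits listed above depend heavily on the shard key chosen in Step 9. As a rough illustration, the sketch below compares ranged and hashed placement for a key that only ever increases, such as a counter or timestamp. It is a simplified simulation, not MongoDB's actual algorithm: the split points are made up, and the hash is FNV-1a taken modulo the shard count, whereas MongoDB uses its own hash function and assigns ranges of hash values to chunks.

```javascript
const SHARD_COUNT = 3;

// Ranged placement: each shard owns a contiguous key range (made-up split points).
// shard 0: key < 1000, shard 1: 1000 <= key < 2000, shard 2: key >= 2000
const splitPoints = [1000, 2000];
function rangedShard(key) {
  let shard = 0;
  while (shard < splitPoints.length && key >= splitPoints[shard]) shard++;
  return shard;
}

// Simplified hashed placement: 32-bit FNV-1a of the key's decimal string,
// reduced modulo the number of shards.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}
function hashedShard(key) {
  return fnv1a(String(key)) % SHARD_COUNT;
}

// Count how many keys each shard receives under a placement function.
function countPerShard(keys, placement) {
  const counts = new Array(SHARD_COUNT).fill(0);
  for (const key of keys) counts[placement(key)]++;
  return counts;
}

// 3000 new inserts whose key keeps increasing past the last split point.
const newKeys = Array.from({ length: 3000 }, (_, i) => 3000 + i);
const rangedCounts = countPerShard(newKeys, rangedShard);
const hashedCounts = countPerShard(newKeys, hashedShard);

console.log("ranged:", rangedCounts);
console.log("hashed:", hashedCounts);
```

With ranged placement every new key falls into the last range, so one shard absorbs all of the writes, while the hashed variant spreads them across shards. The trade-off is that hashing scatters neighboring keys, so range queries on the shard key generally have to consult more than one shard.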