Content
Introduction
In this short tutorial we will look at how you can set up a MongoDB replica set.
First we give a little introduction to replica sets, after which we show how to create a replica set.
In a next tutorial we show how to enable sharding building on this example.
In our example each mongo instance will be running in a seperate container as to support the fault tolerant requirement.
Furthermore we try to minimize command line arguments by moving command line to shell scripts, this way starting new replica sets requires less steps, however it also means we create some assumptions about the naming and amount of instances in a replica set.
Although most steps are explained step by step, minor docker and database knowledge is required.
WARNING:
This example is not intended for production environments, no mind has been given to security concerns.
Mongo Replica set
To make your application more fault tolerant you can create a replica set. A replica set is a collection of MongoDB instances that contain copies of the same data. Instead of accessing one specific instance, you would then access the replica set, which would then get data from the current primary instance. Whenever the primary instance in the replica set becomes unavailable the other instances vote for a new primary from which data can be accessed. An example of this can be seen in the figure below.
Important for the voting process is that there are at least 3 voting instances, as with 2 instances one instance cannot decide which instance is failing. Next to primaries and secondaries a third type of instance exists, the arbiter, this instance does not contain data and thus cannot become primary however does participate in the voting process. An example can be seen in the figure below.
More detailed information on replica sets and how to set them up can be found at the MongoDB offical website
Setting up docker containers
To run a replica set first we need to setup a container for each mongo instance. Setting up a container with Mongo is trivial as we can pull a MongoDB image from docker hub. However running a Mongo container for each instance is not enough, as we want to setup our replica set autmatically we also create an init container, which will setup the replica setup on start. For the initialization we will have to create our own image. Below is the content inside our Dockerfile.
FROM mongo
ADD init.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/init.sh
So lets explain each line shortly:
/usr/local/bin/
folder. (explained later)
In this example we startup the containers using docker-compose, so we create a docker-compose.yaml. Below is an example of what this could look like till now.
As we can see there are 3 mongo instances which will be running as primary, secondary or arbiter. rsinit will be our intialization script.
Most important to focus on is that we are using the docker-compose command argument to add some parameters to the default start command,
--wiredTigerCacheSizeGB 0.5 --bind_ip_all --port 27017 --replSet rs0
In this example we use the default mongo port 27017 for each instance, and for simplicity bind all ports.
Further important to notice is that all is instances are part of replSet rs0.
Creating a replica set
Ok so now we have our mongo instances running, how do we make them a replica set? This is where our init script comes in.
#!/bin/bash
echo "prepare rs initiating"
check_db_status() {
mongo1=$(mongo --host mongo --port 27017 --eval "db.stats().ok" | tail -n1 | grep -E '(^|\s)1($|\s)')
mongo2=$(mongo --host mongo-2 --port 27017 --eval "db.stats().ok" | tail -n1 | grep -E '(^|\s)1($|\s)')
mongo3=$(mongo --host mongo-3 --port 27017 --eval "db.stats().ok" | tail -n1 | grep -E '(^|\s)1($|\s)')
if [[ $mongo1 == 1 ]] && [[ $mongo2 == 1 ]] && [[ $mongo3 == 1 ]]; then
init_rs
else
check_db_status
fi
}
init_rs() {
ret=$(mongo --host mongo --port 27017 --eval "rs.initiate({ _id: 'rs0', members: [{ _id: 0, host: 'mongo:27017' }, { _id: 1, host: 'mongo-2:27017' }, { _id: 2, host: 'mongo-3:27017' } ] })" > /dev/null 2>&1)
}
check_db_status > /dev/null 2>&1
echo "rs initiating finished"
exit 0
So lets explain the script step by step.
check_db_status > /dev/null 2>&1
we call our check_db_status function and send the output to /dev/null
rs.initiate({ _id: 'rs0', members: [{ _id: 0, host: 'mongo:27017' }, { _id: 1, host: 'mongo-2:27017' }, { _id: 2, host: 'mongo-3:27017' } ] })
on one of our mongo services (in this case service with name mongo). This command will setup the replica set given its members and the name of the replica set, which to match with our docker-compose has to be rs0.
And that's it! We have our replica set running, one of our instances is primary and the other secondary, to check this connect to one of the containers, open the mongo shell with mongo
and run rs.status()
. To access the replica set give all possible options of members in the url, e.g. mongodb://mongo:27017,mongo-2:27017,mongo-3:27017/?replicaSet=rs0
Next: What is sharding?
In case of very large databases, it might be become unsustainable to have all data in a single location. With sharding data is divided over different 'shards', making it possible to have way bigger data sets which also could still be replicated. To see how to create a sharded MongoDB cluster see Setting up a sharded mongo cluster
More detailed information about sharding and how to set them up can be found at the MongoDB offical website