Set Up a MongoDB Replica Set via Docker Compose

This guide explains how to start a single-node MongoDB replica set with Docker Compose for running unit tests. It also covers ingesting sample data, running the tests, and verifying the setup.

The setup uses the feat/test_framework branch. Check out this branch as described below before continuing.

Clone the feat/test_framework branch:

git clone -b feat/test_framework https://github.com/datazip-inc/olake.git

To update your branch with the latest changes:

git pull origin feat/test_framework

If you have already cloned the olake repository and are on the master branch, perform the following steps instead:

# Fetch all the remote branches
git fetch --all

# Verify that feat/test_framework exists in the remote repo, press `q` to quit.
git branch -r

# This creates a local branch named feat/test_framework and tracks the remote branch origin/feat/test_framework.
git checkout -b feat/test_framework origin/feat/test_framework

# You should see feat/test_framework as the active branch (* feat/test_framework).
git branch

# [optional] To ensure your local branch is up-to-date
git pull origin feat/test_framework

docker-compose.yaml Configuration

This section describes the configuration required to start MongoDB as a replica set inside a Docker container.

drivers/mongodb/docker-compose.yaml
version: "3.7"

services:
  db:
    image: 'mongo:latest'
    environment:
      MONGO_INITDB_ROOT_USERNAME: "olake"
      MONGO_INITDB_ROOT_PASSWORD: "olake"
    command: "mongod --bind_ip_all --replSet rs0 --keyFile /etc/ssl/sample.key"
    entrypoint:
      - "bash"
      - "-c"
      - |
        chmod 400 /etc/ssl/sample.key;
        chown 999:999 /etc/ssl/sample.key;
        exec docker-entrypoint.sh "$@"
    ports:
      - "27017:27017"
    networks:
      - olake-test
    volumes:
      - ${PWD}/drivers/mongodb/test.key:/etc/ssl/sample.key
    healthcheck:
      test: ["CMD", "mongosh", "--eval", "db.adminCommand('ping')"]
      interval: 5s
      timeout: 5s
      retries: 3
      start_period: 5s

networks:
  olake-test:
    name: olake-test
    external: true

docker-compose-init.sh

#!/bin/bash
function mongosh() {
  echo "$@" | docker exec -i mongodb-db-1 mongosh -u olake -p olake admin
}

# Initialize the replica set as a single-node setup
mongosh 'rs.initiate()'
sleep 3

# Update replica set configuration to use localhost:27017
mongosh 'cfg = rs.conf(); cfg.members[0].host="localhost:27017"; rs.reconfig(cfg);'

The script defines a mongosh helper function that pipes its arguments to the MongoDB shell (mongosh) inside the mongodb-db-1 container through docker exec. Step by step:

  • -u olake -p olake admin → Connects to the admin database as user olake with password olake.
  • mongosh 'rs.initiate()' → Initializes a new replica set. Called without arguments, rs.initiate() applies a default configuration in which the current instance is the only member. Running as a replica set is what lets MongoDB support failover and high availability.
  • sleep 3 → Pauses for 3 seconds to give MongoDB time to finish the initialization before the replica set is reconfigured.
  • rs.conf() → Fetches the current replica set configuration.
  • cfg.members[0].host="localhost:27017" → Changes the host of the first (and only) replica set member to localhost:27017. The default configuration registers the member under the container's own hostname, which clients on the host machine typically cannot resolve.
  • rs.reconfig(cfg) → Applies the new replica set configuration.

After running the script, your MongoDB instance inside Docker:

  • Becomes a single-node replica set
  • Uses localhost:27017 as its hostname
  • Is ready for connections as a replica set-enabled database
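
The fixed sleep 3 in the script can be too short on a slow machine and longer than needed on a fast one. A minimal sketch of polling instead is shown here. The names retry and is_primary are hypothetical helpers, not part of the repository script, and the check assumes the db.hello() shell helper is available in the mongo image you are running.

```shell
#!/bin/bash
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS runs out.
retry() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Hypothetical check: succeeds once the node reports itself as a writable primary.
is_primary() {
  docker exec -i mongodb-db-1 mongosh -u olake -p olake admin --quiet \
    --eval 'db.hello().isWritablePrimary' | grep -q true
}

# Usage, in place of `sleep 3` (needs the container to be running):
# retry 30 1 is_primary || { echo "node did not become primary" >&2; exit 1; }
```

Polling with an upper bound keeps the script fast when MongoDB is ready early and makes it fail loudly when it never becomes ready.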

Running the Replica Set

1. Create the Docker Network

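To run the MongoDB unit tests, first create the network, then start the container with Docker Compose and initialize the replica set.

The compose file declares olake-test as an external network, so create it before starting the container: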

docker network create olake-test    

Sample output:

ef0000000x0x0253141909abff6744fcf4c26d8f0e56865cc364690a97289b0f
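
docker network create returns an error if a network named olake-test already exists, which gets in the way when you re-run the setup. A small sketch of an idempotent variant follows. The name ensure_network is a hypothetical helper, not something the repository provides.

```shell
# Create the named Docker network only if it is not already present.
ensure_network() {
  docker network inspect "$1" >/dev/null 2>&1 || docker network create "$1"
}

# Usage (requires a running Docker daemon):
# ensure_network olake-test
```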

2. Start the Docker Container

Run the following command to start MongoDB:

docker compose -f ./drivers/mongodb/docker-compose.yaml up -d  

Sample output: [screenshot: docker compose 1]

3. Initialize the Replica Set

Execute the initialization script:

bash ./drivers/mongodb/docker-compose-init.sh  

Expected output (initialization confirmation): [screenshots: docker compose 2, docker compose 3]

MongoDB Unit Tests

Once the replica set is running, you can execute MongoDB unit tests. For example:

go test -v ./drivers/mongodb/internal

To stop the MongoDB instance, run:

docker compose -f ./drivers/mongodb/docker-compose.yaml down

Updated Source Configuration Example (config.json):

config.json
{
  "hosts": ["localhost:27017"],
  "username": "olake",
  "password": "olake",
  "authdb": "admin",
  "replica-set": "rs0",
  "read-preference": "secondaryPreferred",
  "srv": false,
  "server-ram": 16,
  "database": "reddit",
  "max_threads": 50,
  "default_mode": "cdc",
  "backoff_retry_count": 2
}
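
To sanity-check these values outside OLake, the fields can be mapped onto a standard MongoDB connection string. This is only a sketch of that mapping, using the standard authSource, replicaSet, and readPreference URI options. The name mongo_uri is a hypothetical helper, and OLake may build its own connection differently.

```shell
# mongo_uri USER PASSWORD HOST AUTHDB REPLICA_SET READ_PREFERENCE
# Prints a MongoDB connection string built from the given config.json values.
# Note: credentials containing special characters would need percent-encoding.
mongo_uri() {
  printf 'mongodb://%s:%s@%s/?authSource=%s&replicaSet=%s&readPreference=%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$6"
}

mongo_uri olake olake localhost:27017 admin rs0 secondaryPreferred
# prints: mongodb://olake:olake@localhost:27017/?authSource=admin&replicaSet=rs0&readPreference=secondaryPreferred
```

If mongosh is installed on the host, passing the printed string to it should connect once the replica set has been initialized.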

Ingest / Load sample JSON Data into MongoDB

Importing Sample JSON Data Directly from an API

We have provided some sample datasets for you to test OLake. You can fetch the JSON from the URL and insert it into MongoDB without saving it as a file:

Run This One-Liner in Your Terminal

curl -s "https://www.reddit.com/r/funny.json" | \
jq '.data.children | map(.data)' | \
docker exec -i mongodb-db-1 mongoimport -u olake -p olake --authenticationDatabase admin --db reddit --collection funny --jsonArray

What This Does

  • curl -s "https://www.reddit.com/r/funny.json" → Fetches the Reddit JSON data; -s suppresses the progress output.
  • jq '.data.children | map(.data)' → Takes the children array and replaces each element with its data object, producing a plain JSON array of posts.
  • docker exec -i mongodb-db-1 mongoimport ... → Runs mongoimport inside the container, reading the JSON from standard input.
  • --db reddit → Inserts the data into the reddit database.
  • --authenticationDatabase admin → Authenticates against the admin database, where the olake root user is defined.
  • --collection funny → Stores the data in the funny collection.
  • --jsonArray → Tells mongoimport that the input is a single JSON array of documents, which is what the jq filter produces.
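
To see what the jq filter does without calling the API, the same filter can be run on a tiny hand-written document. The sample below is made up and only mimics the data.children[].data nesting that the filter relies on. The name flatten_listing is hypothetical.

```shell
# Made-up stand-in for the listing shape; real responses carry many more fields.
sample='{"data":{"children":[{"kind":"t3","data":{"id":"a1","title":"first"}},{"kind":"t3","data":{"id":"b2","title":"second"}}]}}'

# Same filter as the one-liner, with -c for compact single-line output.
flatten_listing() {
  jq -c '.data.children | map(.data)'
}

# Usage (requires jq):
# printf '%s\n' "$sample" | flatten_listing
# prints: [{"id":"a1","title":"first"},{"id":"b2","title":"second"}]
```

The wrapper objects (kind and the outer data) are dropped, leaving one array element per post, which is the shape mongoimport --jsonArray expects.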

Verify the Data in MongoDB

Exec into the MongoDB Shell

docker exec -it mongodb-db-1 mongosh -u olake -p olake

Switch to the Database

use reddit

Check the Collection

db.funny.find().limit(5).pretty()

This will show a formatted output of the first 5 documents.
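
For a scripted check, the number of imported documents can be read without opening an interactive shell. This is a sketch that assumes the container name and credentials used above. The name count_docs is a hypothetical helper.

```shell
# count_docs DB COLLECTION: print the number of documents in DB.COLLECTION.
count_docs() {
  docker exec -i mongodb-db-1 mongosh -u olake -p olake \
    --authenticationDatabase admin --quiet \
    --eval "db.getSiblingDB('$1').getCollection('$2').countDocuments({})"
}

# Usage (requires the container to be running):
# count_docs reddit funny
```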

Verifying Your Setup

Use the following command:

docker exec -it mongodb-db-1 mongosh -u olake -p olake --eval "show dbs"

Remove --eval "show dbs" if you only want to open an interactive shell in the container.

You should see output similar to:

Current Mongosh Log ID:	67b701a82xxxxf7925a00aa0
Connecting to: mongodb://<credentials>@127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+2.3.8
Using MongoDB: 8.0.4
Using Mongosh: 2.3.8

For mongosh info see: https://www.mongodb.com/docs/mongodb-shell/

rs0 [direct: primary] test> show dbs;
admin 140.00 KiB
config 80.00 KiB
local 436.00 KiB

OLake Integration

After verifying MongoDB’s configuration, proceed with OLake’s integration steps. See the getting started guide for more information.

The sample commands below rely on the following files. Refer to their docs to verify your configuration:

  1. config.json
  2. catalog.json
  3. writer.json
  4. state.json

1. Discover

./build.sh driver-mongodb discover --config /Users/user_name/Desktop/projects/olake/drivers/mongodb/config/config.json

2. Sync

./build.sh driver-mongodb sync --config /Users/user_name/Desktop/projects/olake/drivers/mongodb/config/config.json --catalog /Users/user_name/Desktop/projects/olake/drivers/mongodb/config/catalog.json --destination /Users/user_name/Desktop/projects/olake/drivers/mongodb/config/writer.json

Sample output from syncing the sample reddit data and writing it locally in Parquet format: [screenshot: docker compose 4]

3. Sync with state

./build.sh driver-mongodb sync --config /Users/user_name/Desktop/projects/olake/drivers/mongodb/config/config.json --catalog /Users/user_name/Desktop/projects/olake/drivers/mongodb/config/catalog.json --destination /Users/user_name/Desktop/projects/olake/drivers/mongodb/config/writer.json --state /Users/user_name/Desktop/projects/olake/drivers/mongodb/config/state.json
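
The commands above repeat the same long absolute paths. One way to shorten them, sketched here, is to keep the directory in a variable. CONFIG_DIR, OLAKE_BUILD, and olake_sync are hypothetical names, and the defaults assume you run from the repository root with the config files under drivers/mongodb/config, as in the paths above.

```shell
# Hypothetical variables: config directory and path to the build script.
CONFIG_DIR="${CONFIG_DIR:-$PWD/drivers/mongodb/config}"
OLAKE_BUILD="${OLAKE_BUILD:-./build.sh}"

# Run a sync with the shared paths; extra arguments are appended as given.
olake_sync() {
  "$OLAKE_BUILD" driver-mongodb sync \
    --config "$CONFIG_DIR/config.json" \
    --catalog "$CONFIG_DIR/catalog.json" \
    --destination "$CONFIG_DIR/writer.json" \
    "$@"
}

# Usage (from the repository root):
# olake_sync
# olake_sync --state "$CONFIG_DIR/state.json"
```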

Troubleshooting:

If you encounter any issues:

  • Verify the Docker network olake-test exists.
  • Check container logs using docker logs mongodb-db-1.
  • Ensure that the MongoDB key file permissions are correctly set by the entrypoint script.
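
The key file check in the last bullet can be run from the host while the container is up. This is a sketch that assumes the image ships GNU stat. The name check_keyfile is a hypothetical helper. The expected value matches what the entrypoint in docker-compose.yaml sets: mode 400 and owner 999:999.

```shell
# Compare the key file's mode and owner inside the container with the
# values set by the compose entrypoint (chmod 400, chown 999:999).
check_keyfile() {
  local actual
  actual="$(docker exec mongodb-db-1 stat -c '%a %u:%g' /etc/ssl/sample.key)"
  if [ "$actual" = "400 999:999" ]; then
    echo "key file OK"
  else
    echo "unexpected key file mode/owner: $actual" >&2
    return 1
  fi
}

# Usage (requires the container to be running):
# check_keyfile
```

If the container exits immediately, docker exec will not work, so fall back to docker logs mongodb-db-1 in that case.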

Need Assistance?

If you have any questions or uncertainties about setting up OLake, contributing to the project, or troubleshooting any issues, we’re here to help. You can:

  • Email Support: Reach out to our team at hello@olake.io for prompt assistance.
  • Join our Slack Community: We discuss future roadmaps and bugs there, and help people debug the issues they run into.
  • Schedule a Call: If you prefer a one-on-one conversation, schedule a call with our CTO and team.

Your success with OLake is our priority. Don’t hesitate to contact us if you need any help or further clarification!