config.json
- MongoDB
- Postgres
- MySQL
Source Configuration
Below is a sample config.json for connecting to a MongoDB replica set. Customize each field to match your environment.
{
  "hosts": [
    "host1:27017",
    "host2:27017",
    "host3:27017"
  ],
  "username": "test",
  "password": "test",
  "authdb": "admin",
  "replica-set": "rs0",
  "read-preference": "secondaryPreferred",
  "srv": false,
  "server-ram": 16,
  "database": "database",
  "max_threads": 50,
  "default_mode": "cdc",
  "backoff_retry_count": 2,
  "partition_strategy": ""
}
Description of above parameters
Field | Description | Example Value | Data Type |
---|---|---|---|
hosts | List of MongoDB hosts. Use a DNS SRV hostname if srv = true. | x.xxx.xxx.120:27017, x.xxx.xxx.120:27017, x.xxx.xxx.133:27017 (can be multiple) | []STRING |
username/password | Credentials for MongoDB authentication. | "test"/"test" | STRING |
authdb | Authentication database (often admin). | "admin" | STRING |
replica-set | Name of the replica set, if applicable. | "rs0" | STRING |
read-preference | Which node to read from (e.g., secondaryPreferred). | "secondaryPreferred" | STRING |
srv | Set to true when using DNS SRV connection strings. When true, the hosts field can contain only one host. | true, false | BOOL |
server-ram | Memory management hint for the OLake container. | 16 | UINT |
database | The MongoDB database name to replicate. | "database_name" | STRING |
max_threads | Maximum parallel threads for chunk-based snapshotting. | 50 | INT |
default_mode | Default sync mode ("cdc" or "full_refresh"). | "cdc", "full_refresh", "incremental" (WIP) | STRING |
backoff_retry_count | Number of times to retry establishing the sync after a failure. The wait between retries increases exponentially (in minutes: 1, 2, 4, 8, 16, ..., depending on the backoff_retry_count value). | Defaults to 3; the default is also used if set to -1 | INT |
partition_strategy | The partition strategy for backfill. | "timestamp"; uses the Split-Vector strategy if left empty | STRING |
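The exponential wait described for backoff_retry_count can be illustrated with a short Python sketch. This is not OLake's implementation; the function name backoff_schedule is hypothetical, and it assumes the first retry waits 1 minute, each later wait doubles, and there is one wait per retry.

```python
def backoff_schedule(retry_count: int, default: int = 3) -> list[int]:
    """Return the assumed wait before each retry, in minutes."""
    # Per the table, -1 means "use the default value" (3).
    if retry_count == -1:
        retry_count = default
    if retry_count < 0:
        raise ValueError("backoff_retry_count must be -1 or a non-negative integer")
    # Assumption: waits start at 1 minute and double: 1, 2, 4, 8, ...
    return [2 ** attempt for attempt in range(retry_count)]


print(backoff_schedule(2))   # [1, 2]
print(backoff_schedule(-1))  # [1, 2, 4]
```

Under these assumptions, the sample value of 2 would allow two retries, with waits of 1 and 2 minutes.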
Refer here for more about sync modes.
Get more information about MongoDB Source Configuration here.
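Because srv and hosts constrain each other, it can help to check a MongoDB config.json before starting a sync. The sketch below is a hypothetical pre-flight check, not part of OLake; the function name check_mongo_hosts is made up for illustration, and treating a missing srv field as false is an assumption.

```python
import json


def check_mongo_hosts(config: dict) -> list[str]:
    """Return a list of problems with the hosts/srv fields (empty if none)."""
    problems = []
    hosts = config.get("hosts")
    if not isinstance(hosts, list) or not hosts:
        problems.append("hosts must be a non-empty list of strings")
    # Assumption: a missing srv field is treated as false.
    elif config.get("srv", False) and len(hosts) != 1:
        problems.append(
            f"srv is true, so hosts must contain exactly 1 entry (found {len(hosts)})"
        )
    return problems


# Two hosts combined with srv = true violates the rule from the table.
config = json.loads('{"hosts": ["host1:27017", "host2:27017"], "srv": true}')
print(check_mongo_hosts(config))
```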
Source Configuration
Below is a sample config.json for connecting to a Postgres database. Customize each field to match your environment.
{
  "host": "localhost",
  "port": 5432,
  "database": "main",
  "username": "main",
  "password": "password",
  "jdbc_url_params": {},
  "ssl": {
    "mode": "disable"
  },
  "update_method": {
    "replication_slot": "postgres_slot",
    "intial_wait_time": 10
  },
  "reader_batch_size": 100000,
  "default_mode": "cdc",
  "max_threads": 50
}
Description of above parameters
Field | Description | Example Value | Data Type |
---|---|---|---|
host | The hostname or IP address of the database server. | localhost | String |
port | The port number through which the database server is accessible. | 5432 | Integer |
database | The name of the target database to connect to. | main | String |
username | The username used for authenticating with the database. | main | String |
password | The password corresponding to the provided username for authentication. | password | String |
jdbc_url_params | A collection of additional JDBC URL parameters to fine-tune the connection. | {} | Object |
ssl | SSL configuration for the database connection. Contains details such as the SSL mode. | {"mode": "disable"} | Object |
update_method | Specifies the mechanism for updating data. Includes properties for a replication slot and an initial wait time. | {"replication_slot": "postgres_slot", "intial_wait_time": 10} | Object |
reader_batch_size | The maximum number of records processed per batch during reading operations. | 100000 | Integer |
default_mode | Defines the default mode of operation, for example, using CDC (Change Data Capture). | cdc | String |
max_threads | The maximum number of threads allocated for parallel processing tasks. | 50 | Integer |
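To see how the connection fields in the table fit together, the following Python sketch assembles them into a standard PostgreSQL connection URI. It is only an illustration: the function name postgres_uri is hypothetical, and mapping ssl.mode to the sslmode parameter and appending jdbc_url_params as query parameters are assumptions, not a description of how OLake builds its connection.

```python
from urllib.parse import quote, urlencode


def postgres_uri(config: dict) -> str:
    """Build a postgresql:// URI from the fields of a Postgres config.json."""
    # Percent-encode credentials and database name so special characters are safe.
    user = quote(config["username"], safe="")
    password = quote(config["password"], safe="")
    database = quote(config["database"], safe="")
    # Assumption: extra parameters and the SSL mode travel as query parameters.
    params = dict(config.get("jdbc_url_params", {}))
    params["sslmode"] = config.get("ssl", {}).get("mode", "disable")
    return (
        f"postgresql://{user}:{password}"
        f"@{config['host']}:{config['port']}/{database}?{urlencode(params)}"
    )


sample = {
    "host": "localhost",
    "port": 5432,
    "database": "main",
    "username": "main",
    "password": "password",
    "jdbc_url_params": {},
    "ssl": {"mode": "disable"},
}
print(postgres_uri(sample))
# postgresql://main:password@localhost:5432/main?sslmode=disable
```

A URI like this can be handy for checking the same credentials with a standalone client before running a sync.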
Get more information about Postgres Source Configuration here.
Source Configuration
Below is a sample config.json for connecting to a MySQL database. Customize each field to match your environment.
{
  "hosts": "localhost",
  "username": "root",
  "password": "password",
  "database": "main",
  "port": 3306,
  "tls_skip_verify": true,
  "default_mode": "cdc",
  "max_threads": 10,
  "backoff_retry_count": 2
}
Description of above parameters
Field | Description | Example Value | Data Type |
---|---|---|---|
hosts | Host address of the database server to connect to. | "localhost" | STRING |
username | Username for authenticating with the database. | "root" | STRING |
password | Password for the database user. | "password" | STRING |
database | Name of the target database to use. | "main" | STRING |
port | Port number on which the database server is listening. | 3306 | INT |
tls_skip_verify | Indicates whether to skip TLS certificate verification for secure connections. | true | BOOL |
default_mode | Default synchronization mode (e.g., change data capture, abbreviated as "cdc"). | "cdc" | STRING |
max_threads | Maximum number of parallel threads allowed for processing or syncing data. | 10 | INT |
backoff_retry_count | Number of retry attempts for establishing sync, using an exponential backoff strategy upon failures. | 2 | INT |
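The Data Type column above can be turned into a quick sanity check for a MySQL config.json. The sketch below is a hypothetical helper, not an OLake feature; the names EXPECTED_TYPES and type_errors are made up for illustration, and treating every listed field as required is an assumption.

```python
# Python types matching the Data Type column of the MySQL table.
EXPECTED_TYPES = {
    "hosts": str,
    "username": str,
    "password": str,
    "database": str,
    "port": int,
    "tls_skip_verify": bool,
    "default_mode": str,
    "max_threads": int,
    "backoff_retry_count": int,
}


def type_errors(config: dict) -> list[str]:
    """Return one message per missing or wrongly typed field."""
    errors = []
    for field, expected in EXPECTED_TYPES.items():
        # Assumption: every field in the table is required.
        if field not in config:
            errors.append(f"{field}: missing")
            continue
        value = config[field]
        # bool is a subclass of int in Python, so reject it for INT fields.
        if expected is int and isinstance(value, bool):
            ok = False
        else:
            ok = isinstance(value, expected)
        if not ok:
            errors.append(
                f"{field}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors


sample = {
    "hosts": "localhost",
    "username": "root",
    "password": "password",
    "database": "main",
    "port": 3306,
    "tls_skip_verify": True,
    "default_mode": "cdc",
    "max_threads": 10,
    "backoff_retry_count": 2,
}
print(type_errors(sample))                      # []
print(type_errors(dict(sample, port="3306")))   # ['port: expected int, got str']
```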
Get more information about MySQL Source Configuration here.