Overview
The OLake MySQL Source connector supports multiple sync modes. It also offers features like parallel chunking, checkpointing, and automatic resume for failed full loads. This connector can be used within the OLake UI or run locally via Docker for open-source workflows.
Sync Modes Supported
- Full Refresh
- Full Refresh + CDC
- CDC Only
- Full Refresh + Incremental
Prerequisites
Version Prerequisites
MySQL Version: MySQL 5.7+
CDC Prerequisite
- Binary Logging (Required):
  - log_bin=ON
  - binlog_format=ROW
  - binlog_row_image=FULL
  - binlog_row_metadata=FULL
- Server Config: Set a unique server-id.
- Access & Permissions: REPLICATION SLAVE, REPLICATION CLIENT, and SELECT privileges.
- For detailed setup for different MySQL environments, see the guides in the section below.
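The binary logging requirements above can be checked programmatically. The following is a minimal sketch, not part of OLake: it assumes the server variables have already been fetched (for example with `SHOW VARIABLES`) into a plain dictionary, and the function name `find_cdc_misconfigurations` is hypothetical.

```python
# Required values, taken from the CDC prerequisites listed above.
REQUIRED_BINLOG_SETTINGS = {
    "log_bin": "ON",
    "binlog_format": "ROW",
    "binlog_row_image": "FULL",
    "binlog_row_metadata": "FULL",
}


def find_cdc_misconfigurations(server_variables):
    """Return a list of problems; an empty list means every check passed.

    `server_variables` is assumed to map variable names to the values
    reported by the MySQL server (hypothetical input shape).
    """
    problems = []
    for name, expected in REQUIRED_BINLOG_SETTINGS.items():
        actual = server_variables.get(name)
        if actual is None:
            problems.append(f"{name} is not set (expected {expected})")
        elif str(actual).upper() != expected:
            problems.append(f"{name}={actual} (expected {expected})")
    return problems


if __name__ == "__main__":
    reported = {"log_bin": "ON", "binlog_format": "MIXED", "binlog_row_image": "FULL"}
    for problem in find_cdc_misconfigurations(reported):
        print(problem)
```

The sketch does not cover the server-id or privilege requirements, since uniqueness of server-id cannot be verified from a single server's variables.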
Connection Prerequisites
- Read access to the tables for the MySQL user.
Once the prerequisites are met, the MySQL source can be configured.
Configuration
- Use OLake UI for MySQL
- Use OLake CLI for MySQL
1. Navigate to the Source Configuration Page
- Complete the OLake UI Setup Guide.
- After logging in to the OLake UI, select the Sources tab from the left sidebar.
- Click Create Source in the top right corner.
- Select MySQL from the connector dropdown.
- Provide a name for this source.
2. Provide Configuration Details
- Enter MySQL credentials.
Field | Description | Example Value |
---|---|---|
MySQL Host required | List of database host addresses to connect to. | mysql-host |
Username required | Username for authenticating with the database. | mysql-user |
Password required | Password for the database user. | mysqlpwd |
Database | Name of the target database to use. | mysql-db |
Port required | Port number on which the database server is listening. | 3306 |
Update Method | | CDC |
Initial Wait Time | Maximum duration in seconds to wait before considering the binlog syncer idle. | 0 |
Skip TLS Verification | Indicates whether to skip TLS certificate verification. | false |
Max Threads | Maximum number of parallel threads for processing or syncing data. | 3 |
Backoff Retry Count | Number of retry attempts for establishing sync with exponential backoff. | 3 |
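The Backoff Retry Count field refers to retrying with exponential backoff. As a generic illustration of that technique (not OLake's actual implementation), the sketch below doubles the wait after each failed attempt. The base delay of one second, the choice to count retries separately from the first attempt, and the name `retry_with_backoff` are all assumptions made for the example.

```python
import time


def retry_with_backoff(operation, retry_count, base_delay_seconds=1.0, sleep=time.sleep):
    """Call `operation`; on failure wait base, 2*base, 4*base, ... and retry.

    `retry_count` is the number of retries allowed after the first attempt
    (an assumption; the connector may count attempts differently).
    """
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= retry_count:
                raise  # retries exhausted: surface the last error
            sleep(base_delay_seconds * (2 ** attempt))
            attempt += 1


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_connect():
        # Hypothetical operation that fails twice before succeeding.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("not ready")
        return "connected"

    print(retry_with_backoff(flaky_connect, retry_count=3, base_delay_seconds=0.01))
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays.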
3. Test Connection
- Once the connection is validated, the MySQL source is created. Jobs can then be configured using this source.
- In case of connection failure, refer to the Troubleshooting section.
1. Create Configuration File
- Once the OLake CLI is set up, create a folder to store configuration files such as `source.json` and `destination.json`.
The `source.json` file for MySQL must contain the mandatory fields listed below.
2. Provide Configuration Details
An example `source.json` file looks like this:
```json
{
  "hosts": "mysql-host",
  "username": "mysql-user",
  "password": "mysqlpwd",
  "database": "my-db",
  "port": 3306,
  "tls_skip_verify": true,
  "update_method": {
    "initial_wait_time": 10
  },
  "max_threads": 5,
  "backoff_retry_count": 4
}
```
Field | Description | Type | Example Value |
---|---|---|---|
hosts required | List of database host addresses to connect to. | string | "mysql-host" |
username required | Username for authenticating with the database. | string | "mysql-user" |
password required | Password for the database user. | string | "mysqlpwd" |
database | Name of the target database to use. | string | "my-db" |
port required | Port number on which the database server is listening. | integer | 3306 |
update_method required for CDC | Required for CDC sync configuration | object | {"initial_wait_time": 120} |
initial_wait_time required for CDC | Maximum duration in seconds to wait before considering the binlog syncer idle. | integer | 10 |
tls_skip_verify | Indicates whether to skip TLS certificate verification. | bool | false |
max_threads | Maximum number of parallel threads for processing or syncing data. | integer | 3 |
backoff_retry_count | Number of retry attempts for establishing sync with exponential backoff. | integer | 3 |
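Before running the connector, it can help to confirm that `source.json` contains the fields the table marks as required. The following is a minimal sketch under that reading of the table; the helper names `missing_required_fields` and `check_config_file` are hypothetical and not part of the OLake CLI.

```python
import json

# Fields marked "required" in the table above.
REQUIRED_FIELDS = ("hosts", "username", "password", "port")


def missing_required_fields(config, cdc=False):
    """Return the names of required fields absent from `config`.

    With cdc=True, also require `update_method.initial_wait_time`,
    which the table marks as required for CDC.
    """
    missing = [field for field in REQUIRED_FIELDS if field not in config]
    if cdc:
        update_method = config.get("update_method")
        if not isinstance(update_method, dict):
            missing.append("update_method")
        elif "initial_wait_time" not in update_method:
            missing.append("update_method.initial_wait_time")
    return missing


def check_config_file(path, cdc=False):
    """Load a JSON config file and report its missing required fields."""
    with open(path, encoding="utf-8") as handle:
        return missing_required_fields(json.load(handle), cdc=cdc)


if __name__ == "__main__":
    example = {"hosts": "mysql-host", "username": "mysql-user", "port": 3306}
    print(missing_required_fields(example, cdc=True))
```

This only checks for presence; it does not validate types or test connectivity, which the `check` command covers.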
Similarly, a `destination.json` file can be created inside this folder. For more information, see the destination documentation.
3. Check Source Connection
To verify the database connection, run the following command:
docker run --pull=always \
-v "[PATH_OF_CONFIG_FOLDER]:/mnt/config" \
olakego/source-mysql:latest \
check \
--config /mnt/config/source.json
- If OLake is able to connect to MySQL, the following response is returned:
  {"connectionStatus":{"status":"SUCCEEDED"},"type":"CONNECTION_STATUS"}
- In case of connection failure, refer to the Troubleshooting section.
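When the check command is run from a script, the response can be inspected automatically. The sketch below assumes the status message is printed as a single JSON line, possibly mixed with other log lines; that output layout and the function name `connection_succeeded` are assumptions, and the `FAILED` value used in the demo is a hypothetical stand-in for a non-success status.

```python
import json


def connection_succeeded(output):
    """Return True if `output` contains a CONNECTION_STATUS message
    whose status is SUCCEEDED, scanning the text line by line."""
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip non-JSON log lines
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue
        if message.get("type") == "CONNECTION_STATUS":
            status = message.get("connectionStatus") or {}
            return status.get("status") == "SUCCEEDED"
    return False


if __name__ == "__main__":
    sample = '{"connectionStatus":{"status":"SUCCEEDED"},"type":"CONNECTION_STATUS"}'
    print(connection_succeeded(sample))
```

The text passed in could be the captured standard output of the `docker run ... check` command shown above.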
Data Type Mapping
MySQL Data Types | Destination Data Type |
---|---|
int, int unsigned, mediumint, mediumint unsigned, smallint, smallint unsigned, tinyint, tinyint unsigned | int |
bigint, bigint unsigned | bigint |
float, decimal(10,2) | float |
double, double precision, real | double |
datetime, timestamp | timestamptz |
char, varchar, text, tinytext, mediumtext, longtext, enum, json, bit(1), time | string |
OLake always ingests timestamp data in UTC format, independent of the source timezone.
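To illustrate what normalising to UTC means in practice, the sketch below converts a wall-clock value recorded at a known UTC offset into its UTC equivalent. It is a generic example, not OLake's conversion code; how the connector interprets each MySQL column type is not specified here, and the function name `to_utc` and the offset parameter are assumptions.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_value, source_utc_offset_hours):
    """Interpret a naive datetime as wall-clock time at the given UTC
    offset and return the equivalent timezone-aware UTC datetime."""
    source_tz = timezone(timedelta(hours=source_utc_offset_hours))
    return local_value.replace(tzinfo=source_tz).astimezone(timezone.utc)


if __name__ == "__main__":
    # 12:00 at UTC+05:30 corresponds to 06:30 UTC.
    converted = to_utc(datetime(2025, 8, 27, 12, 0, 0), 5.5)
    print(converted.isoformat())  # 2025-08-27T06:30:00+00:00
```

Downstream queries that compare timestamps across sources can rely on the stored values sharing one timezone.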
Troubleshooting
1. Failed to Get Current Binlog Position
Cause: Binary logging not enabled, wrong format, or insufficient privileges.
Fix:
- Ensure binary logging is enabled:
  SHOW VARIABLES LIKE 'log_bin'; -- Should return `ON`
- Ensure row-based logging:
  SHOW VARIABLES LIKE 'binlog_format'; -- Should return `ROW`
- Grant required privileges:
  GRANT REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'your_user'@'%';
2. Idle Timeout Reached, Exiting Binlog Syncer
- Cause: No new changes within the configured `initial_wait_time`.
- Fix: Increase `initial_wait_time` in the connector configuration or verify data changes in the source database.
3. Column Count Mismatch: Expected X, Got Y
- Cause: Table schema changed after CDC started or incomplete binlog metadata.
- Fix:
  - Ensure full binlog metadata:
    SHOW VARIABLES LIKE 'binlog_row_metadata';
    Should return `FULL`. To set it:
    SET GLOBAL binlog_row_metadata = 'FULL';
    Update `my.cnf` or `my.ini` for persistence.
  - Restart the connector after schema changes.
4. Failed to Get or Split Chunks
- Cause: Table stats not populated or table contains 0 records.
- Fix: Run the following query to populate table statistics:
  ANALYZE TABLE <namespace>.<table_name>;
If the issue is not listed here, post your question on Slack to get it resolved within a few hours.
Changelog
Date of Release | Version | Description |
---|---|---|
27 August 2025 | 0.1.11 | override default timeout in Discover |