destination.json
Before proceeding with any of the configurations below, make sure you have completed the getting started instructions for the database you want to connect from. For more background, refer to the README section.
- Iceberg
- S3
- Local
Apache Iceberg
The Iceberg Writer syncs data from databases (MySQL, MongoDB, PostgreSQL) into Apache Iceberg.
- AWS Glue
- JDBC Catalog
- REST Catalog
- Hive Catalog
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "glue",
    "iceberg_s3_path": "s3://<BUCKET_NAME>/<S3_PREFIX_VALUE>",
    "aws_region": "ap-south-1",
    "aws_access_key": "XXX",
    "aws_secret_key": "XXX",
    "iceberg_db": "ICEBERG_DATABASE_NAME",
    "grpc_port": 50051,
    "server_host": "localhost"
  }
}
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "jdbc",
    "jdbc_url": "jdbc:postgresql://host.docker.internal:5432/iceberg",
    "jdbc_username": "iceberg",
    "jdbc_password": "password",
    "iceberg_s3_path": "s3a://warehouse",
    "s3_endpoint": "http://host.docker.internal:9000",
    "s3_use_ssl": false,
    "s3_path_style": true,
    "aws_access_key": "admin",
    "aws_region": "ap-south-1",
    "aws_secret_key": "password",
    "iceberg_db": "ICEBERG_DATABASE_NAME"
  }
}
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "rest",
    "rest_catalog_url": "http://localhost:8181/catalog",
    "iceberg_s3_path": "warehouse",
    "iceberg_db": "ICEBERG_DATABASE_NAME"
  }
}
{
  "type": "ICEBERG",
  "writer": {
    "catalog_type": "hive",
    "iceberg_s3_path": "s3a://warehouse/",
    "aws_region": "us-east-1",
    "aws_access_key": "admin",
    "aws_secret_key": "password",
    "s3_endpoint": "http://localhost:9000",
    "hive_uri": "thrift://localhost:9083",
    "s3_use_ssl": false,
    "s3_path_style": true,
    "hive_clients": 5,
    "hive_sasl_enabled": false,
    "iceberg_db": "ICEBERG_DATABASE_NAME"
  }
}
For sample configurations and other details regarding Apache Iceberg, refer to the Iceberg writer docs.
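Before handing a destination.json to OLake, it can help to check that the writer block carries the keys its catalog type needs. The sketch below is a minimal pre-flight check in Python. The `SAMPLE_KEYS` mapping and the `missing_iceberg_keys` helper are hypothetical names, and the key sets are an assumption drawn from the four samples above, not from OLake's own validation rules.

```python
# Keys that identify the catalog and warehouse in each sample above.
# ASSUMPTION: this mapping is derived from the samples, not from OLake's schema.
SAMPLE_KEYS = {
    "glue": {"iceberg_s3_path", "aws_region", "iceberg_db"},
    "jdbc": {"jdbc_url", "jdbc_username", "jdbc_password", "iceberg_s3_path", "iceberg_db"},
    "rest": {"rest_catalog_url", "iceberg_s3_path", "iceberg_db"},
    "hive": {"hive_uri", "iceberg_s3_path", "iceberg_db"},
}


def missing_iceberg_keys(config):
    """Return the sorted sample keys that are absent from an ICEBERG config."""
    if config.get("type") != "ICEBERG":
        raise ValueError('expected "type": "ICEBERG"')
    writer = config.get("writer", {})
    catalog = writer.get("catalog_type")
    if catalog not in SAMPLE_KEYS:
        raise ValueError(f"unknown catalog_type: {catalog!r}")
    # Set difference: keys the samples use that this writer block lacks.
    return sorted(SAMPLE_KEYS[catalog] - set(writer))
```

Loading the file with `json.load` and passing the result to `missing_iceberg_keys` returns an empty list when nothing from the sample key set is missing.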
S3 Writer
OLake’s Parquet S3 writer allows you to write your data directly into an Amazon S3 bucket in Parquet format. This mode is ideal for users who want to leverage S3’s scalable storage for their data outputs.
{
  "type": "PARQUET",
  "writer": {
    "s3_bucket": "olake-s3-test",
    "s3_region": "ap-south-1",
    "s3_access_key": "xxxxxxxxxxxxxxxxxxxx",
    "s3_secret_key": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "s3_path": "/data"
  }
}
Key | Description | Data Type | Probable Values |
---|---|---|---|
type | Specifies the output file format for writing data. Currently, only the "PARQUET" format is supported. | string | "PARQUET" |
s3_bucket | The name of the Amazon S3/GCS bucket (without s3:// or gs://) where your output files will be stored. Ensure that the bucket exists and that you have proper access. | string | A valid S3 bucket name (e.g. "olake-s3-test") |
s3_region | The AWS/GCS region where the specified bucket is hosted. | string | AWS/GCS region codes such as "us-west-2", "ap-south-1", etc. |
s3_access_key | The AWS/GCS HMAC access key used for authenticating S3 requests. | string | A valid AWS/GCS HMAC access key |
s3_secret_key | The AWS/GCS HMAC secret key used for S3 authentication. This key should be kept secure. | string | A valid AWS/GCS HMAC secret key |
s3_endpoint | (Optional) The custom endpoint for S3-compatible services. Required for GCS using HMAC keys. | string | "https://storage.googleapis.com" |
s3_path | (Optional) The specific path (or prefix) within the S3 bucket where data files will be written. This is typically a folder path that starts with a / (e.g. "/data"). | string | A valid path string |
For sample configurations and other details regarding S3, refer to the S3 writer docs.
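The key table above can be turned into a small generator for the writer block. The following Python sketch assembles a PARQUET destination config, strips an `s3://` or `gs://` scheme from the bucket name, and only emits the optional `s3_endpoint` and `s3_path` keys when they are supplied. The `build_s3_destination` helper is a hypothetical name, and forcing a leading `/` on the path is a convenience assumption based on the table's "typically starts with a /" wording.

```python
import json


def build_s3_destination(bucket, region, access_key, secret_key, path=None, endpoint=None):
    """Assemble a PARQUET destination config from the keys in the table above."""
    # The table asks for a bare bucket name, so drop any s3:// or gs:// scheme.
    for scheme in ("s3://", "gs://"):
        if bucket.startswith(scheme):
            bucket = bucket[len(scheme):]
    writer = {
        "s3_bucket": bucket.strip("/"),
        "s3_region": region,
        "s3_access_key": access_key,
        "s3_secret_key": secret_key,
    }
    if endpoint:
        # Optional; the table lists it as required for GCS with HMAC keys.
        writer["s3_endpoint"] = endpoint
    if path:
        # ASSUMPTION: normalise to a leading "/" as in the "/data" example.
        writer["s3_path"] = path if path.startswith("/") else "/" + path
    return {"type": "PARQUET", "writer": writer}


if __name__ == "__main__":
    # Placeholder credentials; substitute real HMAC keys before use.
    config = build_s3_destination(
        "gs://olake-s3-test", "us-west-2", "<HMAC_ACCESS_KEY>", "<HMAC_SECRET_KEY>",
        path="data", endpoint="https://storage.googleapis.com",
    )
    print(json.dumps(config, indent=2))
```

Keeping the optional keys out of the output when they are unset leaves the generated file close to the sample configuration shown earlier.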
Local Parquet Writer
The local writer mode writes Parquet files directly to a local directory inside your Docker container. That directory is mapped to your host file system via a Docker volume. To run OLake via Docker, follow the getting started guide.
- Using Dockerized OLake
- Build OLake
Sample Configuration
{
  "type": "PARQUET",
  "writer": {
    "local_path": "/mnt/config"
  }
}
Configuration Key Details
Key | Data Type | Example Value | Description & Possible Values |
---|---|---|---|
type | string | "PARQUET" | Specifies the output file format. Currently, only the Parquet format is supported. |
writer.local_path | string | "/mnt/config" | The local directory inside the Docker container where Parquet files will be stored. This path is mapped to your host file system via a Docker volume. |
Note: This configuration enables the Parquet local writer. For more details, check out the README section.
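Because `local_path` names a directory inside the container, the files show up on the host wherever the Docker volume points. The Python sketch below illustrates that mapping for a bind mount written as `host_dir:container_dir`. The `host_path_for` helper and the host directory used in the usage line are hypothetical, and the code assumes POSIX-style paths.

```python
import posixpath


def host_path_for(local_path, volume):
    """Map a container-side local_path to its host-side location.

    `volume` is a Docker bind-mount spec such as "host_dir:container_dir"
    (a trailing ":ro" style option is ignored).
    """
    parts = volume.split(":")
    if len(parts) < 2:
        raise ValueError("volume must look like host_dir:container_dir")
    host_dir, container_dir = parts[0], posixpath.normpath(parts[1])
    local_path = posixpath.normpath(local_path)
    # local_path must be the mounted directory itself or sit beneath it.
    if posixpath.commonpath([local_path, container_dir]) != container_dir:
        raise ValueError(f"{local_path} is not inside the mounted directory {container_dir}")
    relative = posixpath.relpath(local_path, container_dir)
    return posixpath.normpath(posixpath.join(host_dir, relative))


if __name__ == "__main__":
    # Hypothetical host directory mounted at the sample's /mnt/config.
    print(host_path_for("/mnt/config", "/home/user/olake-data:/mnt/config"))
```

If `local_path` falls outside the mounted directory, the helper raises an error; in that situation the Parquet files would stay inside the container rather than appear on the host.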
Sample Configuration
{
  "type": "PARQUET",
  "writer": {
    "local_path": "./mnt/config"
  }
}
Key | Data Type | Example Value | Description & Possible Values |
---|---|---|---|
type | string | "PARQUET" | Specifies the output file format. Currently, only the Parquet format is supported. |
writer.local_path | string | "./mnt/config" | The local directory where Parquet files will be stored. In this sample the path is relative to the directory OLake is run from. |
For sample configurations and other details regarding the local writer, refer to the S3 writer docs.